<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: ptah</title>
    <description>The latest articles on DEV Community by ptah (@ptah).</description>
    <link>https://dev.to/ptah</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1046521%2Fb3cb8b90-cd0f-435b-8845-f3853fb54a23.png</url>
      <title>DEV Community: ptah</title>
      <link>https://dev.to/ptah</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/ptah"/>
    <language>en</language>
    <item>
      <title>Exploratory Data Analysis-Data Visualization Techniques</title>
      <dc:creator>ptah</dc:creator>
      <pubDate>Fri, 06 Oct 2023 11:08:46 +0000</pubDate>
      <link>https://dev.to/ptah/exploratory-data-analysis-data-visualization-techniques-aki</link>
      <guid>https://dev.to/ptah/exploratory-data-analysis-data-visualization-techniques-aki</guid>
      <description>&lt;h2&gt;
  
  
  Unlocking Insights Through the Power of Visualization
&lt;/h2&gt;

&lt;p&gt;Hello there! This week we'll dive deep into the world of Exploratory Data Analysis (EDA) using data visualization techniques. Get ready to uncover hidden patterns in data, gain insights, and make informed decisions.&lt;br&gt;
&lt;strong&gt;What exactly is EDA (Exploratory Data Analysis)?&lt;/strong&gt;&lt;br&gt;
EDA is the first and arguably the most crucial step in data analysis. It's where we take a detective's approach to unravel the story behind a dataset. And what better way to tell a story than through visuals?&lt;/p&gt;
&lt;h2&gt;
  
  
  POWER OF DATA VISUALIZATION
&lt;/h2&gt;

&lt;p&gt;In the data world, visualizations act as the storytellers. They bring data to life by making it accessible and understandable, especially to people who are not familiar with the tech world. Through shapes, graphs, patterns and colors, visualizations give us a better way to perceive information. We are going to briefly look at some popular visualization types:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Histograms&lt;/strong&gt;&lt;br&gt;
Histograms are a great way to understand the distribution of numerical data. They group values into bins and show how many observations fall into each one, making it easy to spot skew, peaks and outliers.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Code example for generating a histogram:&lt;/p&gt;
&lt;/blockquote&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import matplotlib.pyplot as plt
import numpy as np

data = np.random.randn(1000)  # sample data; replace with your own dataset
plt.hist(data, bins=20, color='skyblue', edgecolor='black')
plt.title('Histogram Example')
plt.xlabel('Value')
plt.ylabel('Frequency')
plt.show()

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;2. Scatter plots&lt;/strong&gt;&lt;br&gt;
Scatter plots are a good way to visualize the relationship between two numerical variables. They show whether the variables are positively correlated, negatively correlated, or not correlated at all. The strength of that relationship is often summarized by the correlation coefficient, which ranges from -1 to 1; a value such as 0.99 indicates a very strong positive correlation.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import matplotlib.pyplot as plt
import numpy as np

x = np.random.rand(50)  # sample data; replace with your x variable
y = 2 * x + np.random.normal(0, 0.1, 50)  # sample data; replace with your y variable
plt.scatter(x, y, color="green", marker="o")
plt.title("Scatter Plot Example")  # insert your title
plt.xlabel("x")  # x-axis label
plt.ylabel("y")  # y-axis label
plt.show()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;3. Bar charts&lt;/strong&gt;&lt;br&gt;
Bar charts are an excellent way to compare categorical data. They let you easily identify which categories are most prevalent and spot trends or anomalies.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import matplotlib.pyplot as plt

categories = ['Category A', 'Category B', 'Category C']
values = [10, 20, 15]
plt.bar(categories, values, color='purple')
plt.title('Bar Chart Example')
plt.xlabel('Category')
plt.ylabel('Value')
plt.show()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;4. Box plots&lt;/strong&gt;&lt;br&gt;
Box plots provide a visual summary of the distribution of data, displaying the median, quartiles and outliers, so you can grasp data variability at a glance.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import seaborn as sns
import matplotlib.pyplot as plt

# Assumes seaborn's built-in iris dataset, as the axis labels suggest;
# replace it with your own DataFrame and column names
data = sns.load_dataset('iris')
sns.boxplot(x='species', y='sepal_length', data=data)
plt.title('Box Plot Example')
plt.xlabel('Species')
plt.ylabel('Sepal Length')
plt.show()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;5. Heatmaps&lt;/strong&gt;&lt;br&gt;
Heatmaps are an excellent way of revealing patterns in large datasets. A common use is showing the correlation between variables, where the intensity of the color shows the strength of the relationship.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import seaborn as sns
import matplotlib.pyplot as plt

data = sns.load_dataset('')  # insert a dataset name
# Insert the row, column and value fields; recent pandas versions
# require these to be passed as keyword arguments
pivot_data = data.pivot(index='', columns='', values='')
sns.heatmap(pivot_data, cmap='YlGnBu')
plt.title('Heatmap Example')
plt.xlabel('')  # x-axis label
plt.ylabel('')  # y-axis label
plt.show()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
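
&lt;p&gt;Since heatmaps are often used to show correlation, here is a minimal sketch of a correlation heatmap. The data is randomly generated and the column names (height, weight, age) are made up for illustration, so swap in your own DataFrame.&lt;/p&gt;

```python
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Hypothetical sample data: three numeric columns, two of them related
rng = np.random.default_rng(0)
height = rng.normal(170, 10, 200)
df = pd.DataFrame({
    "height": height,
    "weight": 0.9 * height + rng.normal(0, 5, 200),
    "age": rng.integers(18, 65, 200),
})

# Pairwise Pearson correlation; every value lies between -1 and 1
corr = df.corr()

# Darker cells mean a stronger positive relationship
ax = sns.heatmap(corr, annot=True, cmap="YlGnBu", vmin=-1, vmax=1)
ax.set_title("Correlation Heatmap Example")
# Call plt.show() to display the figure
```

&lt;p&gt;Fixing the color scale with vmin and vmax keeps the colors comparable between different datasets.&lt;/p&gt;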

&lt;h2&gt;
  
  
  Let's dive into some examples
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Sales analysis&lt;/strong&gt;&lt;br&gt;
Imagine you are working for a retail company and you have been given the sales data for the past year. Your goals are to identify sales trends and areas for improvement.&lt;br&gt;
In this case, you can use line charts to visualize the monthly sales trend and answer questions such as: Are there seasonal fluctuations? Which month had peak sales?&lt;br&gt;
&lt;strong&gt;Medical research&lt;/strong&gt;&lt;br&gt;
Imagine you are a medical researcher analyzing patient data. You can use box plots to compare the distribution of a measurement, such as cholesterol level, between different patient groups.&lt;br&gt;
&lt;strong&gt;Customer segmentation&lt;/strong&gt;&lt;br&gt;
Imagine you are part of a marketing team tasked with segmenting customers based on their behavior. Here, scatter plots can help you better understand the relationship between variables such as purchase frequency and average spending per visit.&lt;br&gt;
&lt;em&gt;EDA doesn't stop there. Through EDA you can interpret data far better: ask questions, form hypotheses and let the data guide you to new discoveries. It's through EDA that you can tell whether your data deserves further investigation. That's where the real adventure begins.&lt;/em&gt;&lt;br&gt;
&lt;strong&gt;Wrapping up&lt;/strong&gt;&lt;br&gt;
Visualization is where EDA uncovers the hidden stories within your data. With these techniques, you are well prepared to make informed decisions, solve complex problems and share your insights.&lt;br&gt;
&lt;em&gt;Stay tuned as we continue this data science journey.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--N-zQgIju--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/2eshax9qr92d4bhl6wjc.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--N-zQgIju--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/2eshax9qr92d4bhl6wjc.jpg" alt="Image of a sales team  discussing" width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
    </item>
    <item>
      <title>Comprehensive Guide: Data Science Roadmap for Beginners in 2023-2024</title>
      <dc:creator>ptah</dc:creator>
      <pubDate>Sun, 01 Oct 2023 10:54:25 +0000</pubDate>
      <link>https://dev.to/ptah/comprehensive-guide-data-science-roadmap-for-beginners-in-2023-2024-416n</link>
      <guid>https://dev.to/ptah/comprehensive-guide-data-science-roadmap-for-beginners-in-2023-2024-416n</guid>
      <description>&lt;p&gt;You may have heard the term data science, but what is data science actually, and why the hype around it?&lt;/p&gt;

&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;In simple terms, data science is all about collecting, analyzing and using data to gain meaningful insights, solve problems and make data-driven decisions. Think of it as being a detective for the digital age.&lt;br&gt;
In the ever-evolving world of technology, huge amounts of data are collected every single second, which makes data science an exciting field to venture into. Whether you're a beginner in tech, a recent graduate or just looking to diversify your career, here's a basic roadmap you can use to grow into a capable data scientist in 2024.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Start with the basics.&lt;/strong&gt;&lt;br&gt;
As the saying goes, Rome wasn't built in a day, and you can't just dive straight into data science. Becoming a good data scientist is a process, and you have to trust that process.&lt;br&gt;
Begin your journey by understanding the fundamentals. Learn programming languages such as Python and R, which are essential for data manipulation and analysis, and familiarize yourself with mathematics, statistics and linear algebra, as they are the building blocks of data science.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Dive into data&lt;/strong&gt;&lt;br&gt;
Now that you have familiarized yourself with some of the tools, try working with data. Explore data collection, cleaning and visualization. Look into libraries such as pandas, NumPy, seaborn and Matplotlib to analyze and present data effectively.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Learn data tools&lt;/strong&gt;&lt;br&gt;
The data science landscape is constantly evolving. Stay up to date with tools and technologies such as Jupyter and Kaggle notebooks, SQL and databases, and cloud platforms such as AWS, Google Cloud and Azure. In short, stay ahead of the curve.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. Mastery of machine learning.&lt;/strong&gt;&lt;br&gt;
Now that you can comfortably perform data cleaning and analysis, it's time to dive into machine learning. As I understand it, machine learning is teaching computers to learn from examples (in this case, the cleaned data) and make decisions on their own.&lt;br&gt;
Learn about machine learning algorithms: look at supervised and unsupervised learning, deep learning and so on. Familiarize yourself with libraries such as scikit-learn and TensorFlow to build predictive models and neural networks.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;5. Try your hand at practical projects.&lt;/strong&gt;&lt;br&gt;
I personally recommend this approach, since I learn better by doing. Apply your knowledge through real-world projects and, as you do, build a portfolio that showcases your skills. Projects could range from predicting stock prices to analyzing social media sentiment or customer reviews.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;6. Network and join a community.&lt;/strong&gt;&lt;br&gt;
Networking helps you learn from others, stay motivated, get support and guidance, and discover job opportunities. I recommend that you join data science or data-related communities, attend meetups, and participate in online forums.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;7. Specialize&lt;/strong&gt;&lt;br&gt;
Data is a vast field, ranging from data analysis all the way to machine learning and AI. By now you are familiar with most of the concepts, so identify your interests and specialize in an area such as natural language processing (NLP), big data or computer vision. This deepens your expertise and improves your career prospects.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;8. Online courses and certifications.&lt;/strong&gt;&lt;br&gt;
Now that you are familiar with the world of data science, consider enrolling in online courses and obtaining certifications. Certifications help validate your skills and make your resume stand out.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;9. Stay updated.&lt;/strong&gt;&lt;br&gt;
Once again, you need to keep up with the ever-evolving world of data. Subscribe to blogs, journals, podcasts and YouTube channels to follow the latest trends, research and best practices.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;10. Job hunt&lt;/strong&gt;&lt;br&gt;
Finally, you are ready for the job hunt. Leverage your skills, portfolio, certifications and network to land an internship or entry-level data science role. Remember that continuous learning is key to success in this field.&lt;/p&gt;

&lt;p&gt;This is the approach that I have been using myself, and I will keep you posted on the journey. By following this roadmap and staying dedicated, you can embark on a rewarding journey into the world of data science. Good luck and all the best!&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--Qr_skRy5--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/di7nhgz57c3hoadxaz8g.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--Qr_skRy5--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/di7nhgz57c3hoadxaz8g.jpg" alt="Image showing the winner in a race" width="800" height="422"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
    </item>
    <item>
      <title>Introduction to Python for Data Science</title>
      <dc:creator>ptah</dc:creator>
      <pubDate>Mon, 20 Mar 2023 14:38:05 +0000</pubDate>
      <link>https://dev.to/ptah/introduction-to-python-for-data-science-5bdj</link>
      <guid>https://dev.to/ptah/introduction-to-python-for-data-science-5bdj</guid>
      <description>&lt;p&gt;INTRODUCTION&lt;br&gt;
What is python?&lt;br&gt;
Python is a very popular object-oriented, interactive, high-level and general-purpose interpreted programming language.  The language has gained popularities in the recent years in areas pertaining data science, analytics, machine learning and web development. In this article we will discuss python for data science(basics).&lt;/p&gt;

&lt;p&gt;What is data science?&lt;br&gt;
It is the process of deriving knowledge and insights from large and diverse sets of data by carefully organizing, processing and analysing them.&lt;br&gt;
It can be simplified as the study of data to extract meaningful insights for business using modern tools.&lt;br&gt;
It involves mathematical and statistical modelling, data extraction, and applying data visualization tools and techniques.&lt;/p&gt;

&lt;p&gt;Where is data science applied?&lt;br&gt;
Data science helps various industries solve problems by linking related data and using it to inform future decisions.&lt;br&gt;
These industries include:&lt;br&gt;
  1) Health care&lt;br&gt;
  2) Financial risk management&lt;br&gt;
  3) Energy sector&lt;br&gt;
  4) Computer vision&lt;br&gt;
  5) Transport industry&lt;/p&gt;

&lt;p&gt;Why should you use Python for data science?&lt;br&gt;
a) Python is a very versatile language.&lt;br&gt;
b) It is easy to learn.&lt;br&gt;
c) It has many libraries that make it possible to perform complex tasks with just a few lines of code.&lt;br&gt;
d) It's open source, and thus free to use and modify.&lt;br&gt;
e) It is well supported by a large community.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Which tools and libraries should I use?&lt;/strong&gt;&lt;br&gt;
In the real world, data comes in all shapes and forms, and it is often raw data that needs wrangling before a data scientist can do their primary job of analyzing it.&lt;br&gt;
  It's usually challenging to process, clean and transform the data so that it can be analyzed and modelled to produce insights.&lt;br&gt;
  Python comes with many open-source libraries that help with all of these tasks. Among them are pandas, NumPy and Matplotlib.&lt;br&gt;
  You don't need to master these tools in depth at first, as long as you can organize and clean your data, apply some mathematical formulae and run statistical calculations. You will also need to learn how to import Python modules.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;JUPYTER NOTEBOOK&lt;/em&gt;&lt;br&gt;
 Jupyter Notebook is a web-based interactive computing platform that combines live code, equations, narrative text and visualizations.&lt;br&gt;
 It allows you to code and collaborate with other data scientists using a web browser.&lt;br&gt;
 It is an incredible tool for developing and presenting data science projects, letting you integrate code and its output into a single document alongside visualizations, mathematical formulae and explanations.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;PANDAS&lt;/em&gt;&lt;br&gt;
Pandas is an essential tool for every data scientist. It allows you to clean and transform your data as well as analyze it.&lt;br&gt;
You can load data from various sources, such as CSV files, Excel files and databases.&lt;br&gt;
  It contains a variety of functions for importing, exporting, indexing and manipulating data.&lt;br&gt;
  Pandas also provides handy data structures such as the DataFrame (a 2-dimensional table) and the Series (a 1-dimensional array), along with efficient methods for handling them.&lt;br&gt;
 It can be used to reshape, merge, split and aggregate data.&lt;br&gt;
There are multiple courses on Udemy, DataCamp and YouTube on data science with Python and pandas.&lt;/p&gt;
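
&lt;p&gt;As a minimal sketch of the loading, aggregating and reshaping described above, the example below reads a small CSV and summarizes it. The CSV content and the column names (region, month, sales) are made up for illustration.&lt;/p&gt;

```python
import io
import pandas as pd

# Hypothetical CSV content standing in for a file on disk
csv_text = """region,month,sales
North,Jan,100
North,Feb,120
South,Jan,80
South,Feb,90
"""

# Load: pd.read_csv also accepts a file path such as "sales.csv"
df = pd.read_csv(io.StringIO(csv_text))

# Aggregate: total sales per region (the result is a Series)
totals = df.groupby("region")["sales"].sum()

# Reshape: one row per region, one column per month (a DataFrame)
wide = df.pivot(index="region", columns="month", values="sales")

print(totals)
print(wide)
```

&lt;p&gt;Here df and wide are DataFrames, while totals is a Series indexed by region.&lt;/p&gt;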

&lt;p&gt;&lt;em&gt;NUMPY&lt;/em&gt;&lt;br&gt;
 NumPy is a Python library that provides a simple yet powerful data structure known as the n-dimensional array (ndarray). It aims to provide an array object that is up to 50x faster than traditional Python lists, along with many supporting functions that make working with ndarrays very easy. These tools are used in data science, where speed and resources are very important.&lt;br&gt;
  It can perform mathematical and logical operations on arrays and has a variety of useful capabilities for matrices as well.&lt;/p&gt;
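
&lt;p&gt;The short sketch below illustrates the kinds of mathematical, logical and matrix operations mentioned above. The array values and variable names are arbitrary examples.&lt;/p&gt;

```python
import numpy as np

prices = np.array([10.0, 20.0, 30.0])  # 1-dimensional array
quantities = np.array([1, 2, 3])

# Mathematical operation applied element by element, no loop needed
revenue = prices * quantities

# Logical operation: the comparison produces a boolean array
is_even = quantities % 2 == 0

# Matrix capabilities: transpose and matrix multiplication
matrix = np.array([[1, 2], [3, 4]])
transposed = matrix.T
squared = matrix @ matrix

print(revenue, is_even)
print(transposed)
print(squared)
```

&lt;p&gt;Because the arithmetic runs over whole arrays at once, there is no need to write an explicit Python loop.&lt;/p&gt;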

&lt;p&gt;&lt;em&gt;MATPLOTLIB&lt;/em&gt; &lt;br&gt;
 Matplotlib is a low-level Python library for data visualization, used to communicate the findings of a data analysis project through graphs and charts.&lt;br&gt;
 It works well with arrays and data frames, and treats figures and axes as objects.&lt;br&gt;
 It is highly customizable and pairs well with pandas and NumPy for data analysis.&lt;/p&gt;
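
&lt;p&gt;To illustrate what it means for figures and axes to be objects, here is a small sketch using Matplotlib's object-oriented interface. The plotted data (a sine curve) and the labels are placeholders chosen for this example.&lt;/p&gt;

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 10, 50)  # sample data

# plt.subplots returns a Figure object and an Axes object
fig, ax = plt.subplots(figsize=(6, 4))

# Customization happens by calling methods on the Axes object
ax.plot(x, np.sin(x), color="purple", label="sin(x)")
ax.set_title("Object-Oriented Plot Example")
ax.set_xlabel("x")
ax.set_ylabel("sin(x)")
ax.legend()
# Call plt.show() to display the figure
```

&lt;p&gt;Holding on to fig and ax makes it easy to adjust one plot without affecting others when a script draws several charts.&lt;/p&gt;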

&lt;p&gt;That's all for now. See you in the next article, where I will be covering data analysis.&lt;/p&gt;

</description>
    </item>
  </channel>
</rss>
