<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Shirley Jessy</title>
    <description>The latest articles on DEV Community by Shirley Jessy (@kerubo).</description>
    <link>https://dev.to/kerubo</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2142432%2F3696ebe0-b639-480b-a59b-9caca74a9bf3.png</url>
      <title>DEV Community: Shirley Jessy</title>
      <link>https://dev.to/kerubo</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/kerubo"/>
    <language>en</language>
    <item>
      <title>Introduction to Python for data analysis</title>
      <dc:creator>Shirley Jessy</dc:creator>
      <pubDate>Mon, 07 Oct 2024 13:13:14 +0000</pubDate>
      <link>https://dev.to/kerubo/introduction-to-python-for-data-analysis-252o</link>
      <guid>https://dev.to/kerubo/introduction-to-python-for-data-analysis-252o</guid>
      <description>&lt;p&gt;&lt;strong&gt;What is Python?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Python is a popular programming language. It was created by Guido van Rossum, and released in 1991.&lt;/p&gt;

&lt;p&gt;It is used for:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;web development (server-side),&lt;/li&gt;
&lt;li&gt;software development,&lt;/li&gt;
&lt;li&gt;mathematics,&lt;/li&gt;
&lt;li&gt;system scripting.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;What can Python do?&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Python can be used on a server to create web applications.&lt;/li&gt;
&lt;li&gt;Python can be used alongside software to create workflows.&lt;/li&gt;
&lt;li&gt;Python can connect to database systems. It can also read and modify files.&lt;/li&gt;
&lt;li&gt;Python can be used to handle big data and perform complex mathematics.&lt;/li&gt;
&lt;li&gt;Python can be used for rapid prototyping or for production-ready software development.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Why Python?&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Python works on different platforms (Windows, Mac, Linux, Raspberry Pi, etc.).&lt;/li&gt;
&lt;li&gt;Python has a simple syntax similar to the English language.&lt;/li&gt;
&lt;li&gt;Python has syntax that allows developers to write programs with fewer lines than some other programming languages.&lt;/li&gt;
&lt;li&gt;Python runs on an interpreter system, meaning that code can be executed as soon as it is written. This means that prototyping can be very quick.&lt;/li&gt;
&lt;li&gt;Python can be treated in a procedural way, an object-oriented way, or a functional way.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Why Use Python for Data Analysis?
&lt;/h2&gt;

&lt;p&gt;Ease of Learning: Python’s syntax is clear and intuitive, making it accessible for beginners.&lt;/p&gt;

&lt;p&gt;Rich Libraries: Python offers powerful libraries specifically designed for data analysis, such as:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Pandas: For data manipulation and analysis.&lt;/li&gt;
&lt;li&gt;NumPy: For numerical computations.&lt;/li&gt;
&lt;li&gt;Matplotlib &amp;amp; Seaborn: For data visualization.&lt;/li&gt;
&lt;li&gt;SciPy: For scientific and technical computing.&lt;/li&gt;
&lt;li&gt;Statsmodels: For statistical modeling.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Community and Resources: A large community means plenty of resources, tutorials, and forums for support.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Key Libraries for Data Analysis&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pandas&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Used for data manipulation and analysis.&lt;br&gt;
Offers data structures like DataFrames and Series, which simplify handling and analyzing structured data.&lt;br&gt;
Common operations include filtering, grouping, aggregating, and merging datasets.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import pandas as pd

# Load a dataset
df = pd.read_csv('data.csv')

# Display the first few rows
print(df.head())
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;&lt;strong&gt;NumPy&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Provides support for large, multi-dimensional arrays and matrices.&lt;br&gt;
Offers mathematical functions to operate on these arrays.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import numpy as np

# Create a NumPy array
array = np.array([1, 2, 3, 4])
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;&lt;strong&gt;Matplotlib &amp;amp; Seaborn&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Matplotlib: The foundational library for creating static, interactive, and animated visualizations in Python.&lt;br&gt;
Seaborn: Built on top of Matplotlib, it provides a higher-level interface for drawing attractive statistical graphics.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import matplotlib.pyplot as plt
import seaborn as sns

# Create a simple line plot (df is the DataFrame loaded with pandas earlier)
plt.plot(df['column1'], df['column2'])
plt.show()
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;&lt;strong&gt;SciPy&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Built on NumPy, it provides additional functionality for optimization, integration, interpolation, eigenvalue problems, and other advanced mathematical computations.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Statsmodels&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Useful for statistical modeling and hypothesis testing.&lt;br&gt;
Provides tools for regression analysis, time series analysis, and more.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Basic Data Analysis Workflow&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Data Collection: Gather data from various sources, such as CSV files, databases, or web scraping.&lt;/li&gt;
&lt;li&gt;Data Cleaning: Handle missing values, duplicates, and inconsistencies.&lt;/li&gt;
&lt;li&gt;Exploratory Data Analysis (EDA): Analyze the data through summary statistics and visualizations to understand its structure and patterns.&lt;/li&gt;
&lt;li&gt;Data Manipulation: Transform the data as needed for analysis (e.g., filtering, aggregating).&lt;/li&gt;
&lt;li&gt;Modeling: Apply statistical or machine learning models to derive insights or make predictions.&lt;/li&gt;
&lt;li&gt;Visualization: Create plots to effectively communicate findings.&lt;/li&gt;
&lt;li&gt;Reporting: Summarize results in a clear format for stakeholders.&lt;/li&gt;
&lt;/ol&gt;
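
&lt;p&gt;The cleaning, exploration, and manipulation steps above can be sketched with pandas. This is a minimal illustration rather than a full pipeline: the small DataFrame and its column names (region, units) are hypothetical stand-ins for data you would normally load from a file.&lt;/p&gt;

```python
import pandas as pd

# Hypothetical sample data standing in for a loaded CSV file.
raw = pd.DataFrame(
    {
        "region": ["East", "East", "West", "West", "West"],
        "units": [10, 10, 5, None, 8],
    }
)

# Data cleaning: drop exact duplicate rows, then rows with a missing value.
clean = raw.drop_duplicates().dropna(subset=["units"])

# Exploratory analysis: summary statistics for the numeric column.
summary = clean["units"].describe()

# Data manipulation: total units per region.
totals = clean.groupby("region")["units"].sum()

print(summary)
print(totals)
```

&lt;p&gt;Dropping rows is only one way to treat missing values; depending on the dataset, filling them in may be a better fit.&lt;/p&gt;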

&lt;p&gt;&lt;strong&gt;Conclusion&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Python's robust ecosystem makes it an excellent choice for data analysis. By leveraging libraries like Pandas, NumPy, Matplotlib, and others, you can efficiently manipulate, analyze, and visualize data. Whether you're a beginner or an experienced analyst, mastering Python will enhance your ability to derive insights from data.&lt;/p&gt;

</description>
      <category>data</category>
      <category>analytics</category>
      <category>beginners</category>
      <category>python</category>
    </item>
    <item>
      <title>Introduction to SQL for data analytics</title>
      <dc:creator>Shirley Jessy</dc:creator>
      <pubDate>Sun, 29 Sep 2024 19:12:45 +0000</pubDate>
      <link>https://dev.to/kerubo/introduction-to-sql-for-data-analytics-5dge</link>
      <guid>https://dev.to/kerubo/introduction-to-sql-for-data-analytics-5dge</guid>
      <description>&lt;p&gt;Structured Query Language (SQL) is a language widely used for managing and analyzing data in databases. It is one of the most important tools for data analytics: SQL allows analysts to efficiently query, manipulate, and analyze data to support decision-making.&lt;/p&gt;

&lt;p&gt;&lt;u&gt;&lt;em&gt;&lt;strong&gt;It allows users to perform various operations, including&lt;/strong&gt;&lt;/em&gt;&lt;/u&gt;&lt;/p&gt;

&lt;p&gt;Data Retrieval: Extracting specific data from large datasets.&lt;br&gt;
Data Manipulation: Inserting, updating, and deleting data.&lt;br&gt;
Data Definition: Creating and modifying database structures (tables, indexes).&lt;br&gt;
Data Control: Managing access to data through permissions.&lt;/p&gt;

&lt;p&gt;&lt;u&gt;&lt;strong&gt;Why Use SQL for Data Analytics?&lt;/strong&gt;&lt;/u&gt;&lt;/p&gt;

&lt;p&gt;Versatility: SQL works well with various data analysis tools, such as Python, R, Tableau, and Excel, allowing for seamless integration.&lt;/p&gt;

&lt;p&gt;Ease of Learning: With a relatively straightforward syntax, SQL can be learned quickly, making it accessible to analysts without a programming background.&lt;/p&gt;

&lt;p&gt;Data Accessibility: SQL provides a simple way to access and manipulate data stored in relational databases, which are commonly used in many organizations.&lt;/p&gt;

&lt;p&gt;Efficiency: SQL is optimized for handling large datasets, making it an efficient tool for querying and aggregating data.&lt;/p&gt;

&lt;p&gt;&lt;u&gt;&lt;strong&gt;Key SQL Concepts for Data Analytics&lt;/strong&gt;&lt;/u&gt;&lt;/p&gt;

&lt;p&gt;To effectively use SQL for data analytics, it's essential to understand some core concepts:&lt;/p&gt;

&lt;p&gt;&lt;u&gt;Basic concepts used in MySQL&lt;/u&gt;&lt;/p&gt;

&lt;p&gt;Note that MySQL supports relationships between tables, including one-to-many and many-to-many, among others.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Tables&lt;br&gt;
Data in relational databases is organized into tables, which consist of rows and columns. Each table represents a specific entity (e.g., cities or countries), and each row represents a single record.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Basic SQL Commands&lt;br&gt;
Here are some of the most important SQL commands for data analytics:&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;SELECT: Retrieves data from one or more tables.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;SELECT column1, column2 FROM table_name;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;WHERE: Filters records based on specific conditions.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;SELECT * FROM table_name WHERE condition;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;GROUP BY: Groups rows that have the same values in specified columns.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;SELECT column1, COUNT(*) FROM table_name GROUP BY column1;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;ORDER BY: Sorts the result set in ascending or descending order.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;SELECT * FROM table_name ORDER BY column1 ASC;
&lt;/code&gt;&lt;/pre&gt;
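
&lt;p&gt;To try these commands without installing a database server, one option is Python's built-in sqlite3 module. The sketch below uses an in-memory SQLite database instead of MySQL, and the sales table with its rows is a hypothetical example, so treat it as an illustration of the four commands rather than a MySQL setup.&lt;/p&gt;

```python
import sqlite3

# In-memory SQLite database; the "sales" table and its rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, product TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("East", "pen", 30),
        ("East", "book", 120),
        ("West", "pen", 45),
        ("West", "book", 80),
        ("West", "bag", 200),
    ],
)

# SELECT + WHERE + ORDER BY: rows for one region, smallest amount first.
east_rows = conn.execute(
    "SELECT product, amount FROM sales WHERE region = ? ORDER BY amount ASC",
    ("East",),
).fetchall()

# GROUP BY + ORDER BY: number of sales per region, sorted by region name.
counts = conn.execute(
    "SELECT region, COUNT(*) FROM sales GROUP BY region ORDER BY region ASC"
).fetchall()

print(east_rows)
print(counts)
conn.close()
```

&lt;p&gt;The question mark in the WHERE clause is a placeholder that sqlite3 fills in from the tuple passed alongside the query, which avoids building SQL strings by hand.&lt;/p&gt;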

&lt;ol start="3"&gt;
&lt;li&gt;&lt;p&gt;Aggregation Functions&lt;br&gt;
SQL provides several functions to perform calculations on data:&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;COUNT(): Counts the number of rows.&lt;br&gt;
SUM(): Calculates the total of a numeric column.&lt;br&gt;
AVG(): Computes the average value of a numeric column.&lt;br&gt;
MIN() and MAX(): Find the smallest and largest values, respectively.&lt;/p&gt;
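
&lt;p&gt;The aggregation functions listed above can be combined in a single query. Here is a small sketch, again using Python's sqlite3 module with an in-memory database; the orders table and its values are made up for illustration.&lt;/p&gt;

```python
import sqlite3

# In-memory SQLite database; the "orders" table and its values are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?)",
    [(10.0,), (20.0,), (30.0,), (40.0,)],
)

# One query returning all five aggregates as a single row.
row = conn.execute(
    "SELECT COUNT(*), SUM(amount), AVG(amount), MIN(amount), MAX(amount) "
    "FROM orders"
).fetchone()
count, total, average, smallest, largest = row

print(count, total, average, smallest, largest)
conn.close()
```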

&lt;ol start="4"&gt;
&lt;li&gt;&lt;p&gt;Descriptive Analytics&lt;br&gt;
SQL is commonly used to summarize historical data and generate reports that highlight trends and patterns. Analysts can use SQL queries to analyze sales data, customer behavior, and more.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Data Cleaning and Preparation&lt;br&gt;
Before analysis, data often needs to be cleaned and transformed. SQL provides commands to handle missing values, duplicates, and incorrect data formats, making it easier to prepare data for analysis.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Predictive Analytics&lt;br&gt;
While SQL is primarily used for data retrieval and manipulation, it can support predictive modeling by preparing and aggregating data for use in statistical analysis tools.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Visualization Preparation&lt;br&gt;
SQL queries can be used to aggregate and prepare data for visualization in tools like Tableau or Power BI, making it easier to create meaningful visual representations of data.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
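
&lt;p&gt;As a rough sketch of the data cleaning step mentioned above, two common SQL tools are DISTINCT, which drops duplicate rows from a result, and COALESCE, which substitutes a default for missing (NULL) values. The example uses Python's sqlite3 module with an in-memory database; the customers table, its rows, and the 'Unknown' placeholder are assumptions chosen for illustration.&lt;/p&gt;

```python
import sqlite3

# In-memory SQLite database; the "customers" table and its rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [("Amina", "Nairobi"), ("Amina", "Nairobi"), ("Brian", None)],
)

# DISTINCT removes the duplicate row; COALESCE replaces NULL with a placeholder.
cleaned = conn.execute(
    "SELECT DISTINCT name, COALESCE(city, 'Unknown') "
    "FROM customers ORDER BY name"
).fetchall()

print(cleaned)
conn.close()
```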

&lt;blockquote&gt;
&lt;p&gt;Conclusion&lt;/p&gt;

&lt;p&gt;SQL is a key skill for anyone involved in data analytics. Its ability to efficiently retrieve, manipulate, and analyze data makes it invaluable in various analytical tasks. By understanding SQL, analysts can unlock deeper insights and drive informed decision-making based on data. Whether you are working with small datasets or large databases, SQL remains the key to transforming raw data into more meaningful information.&lt;/p&gt;
&lt;/blockquote&gt;

</description>
      <category>sql</category>
      <category>data</category>
      <category>analytics</category>
      <category>beginners</category>
    </item>
  </channel>
</rss>
