<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Purity Kihoro</title>
    <description>The latest articles on DEV Community by Purity Kihoro (@purity_kihoro).</description>
    <link>https://dev.to/purity_kihoro</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2223061%2F41296bb6-719f-48ec-810c-4d291ab301e5.png</url>
      <title>DEV Community: Purity Kihoro</title>
      <link>https://dev.to/purity_kihoro</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/purity_kihoro"/>
    <language>en</language>
    <item>
      <title>How Analysts Translate Messy Data, DAX, and Dashboards into Action Using Power BI.</title>
      <dc:creator>Purity Kihoro</dc:creator>
      <pubDate>Mon, 09 Feb 2026 02:39:59 +0000</pubDate>
      <link>https://dev.to/purity_kihoro/how-analysts-translate-messy-data-dax-and-dashboards-into-action-using-power-bi-1b6f</link>
      <guid>https://dev.to/purity_kihoro/how-analysts-translate-messy-data-dax-and-dashboards-into-action-using-power-bi-1b6f</guid>
      <description>&lt;p&gt;An analyst's main goal is to spot trends, performance and communicate insights faster from provided data. This makes them seek a software that supports data cleaning, create impactful visuals and present the data in a way that actually tells a story and understand the data better. Hence, Power Bi. &lt;br&gt;
How do analysts use Power Bi?&lt;/p&gt;
&lt;h2&gt;
  
  
  1. Get the data
&lt;/h2&gt;

&lt;p&gt;First, the analyst sources the data. It may come from the company’s database, from Excel or CSV files, from POS machines or even from surveys.&lt;/p&gt;
&lt;h2&gt;
  
  
  2. Power Query in Power BI
&lt;/h2&gt;

&lt;p&gt;They then transform the data in Power Query, where they get to clean it. Clean it? Is data dirty? Yes! Raw data collected from a source is often very messy. Data can be messy in a number of ways, including:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;em&gt;Missing records&lt;/em&gt;&lt;/strong&gt;: these may be resolved by filling in the missing values, for example with "Unknown" for text columns and null for numeric columns.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;em&gt;Duplicate values&lt;/em&gt;&lt;/strong&gt;: Analysts remove the duplicate rows to ensure accuracy in the data.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;em&gt;Inconsistent data&lt;/em&gt;&lt;/strong&gt;: for example, a city column may contain both NRB and Nairobi. These values refer to the same city, but to the software they appear as different cities. The analyst standardizes the data so that all values follow a consistent format.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;em&gt;Wide tables&lt;/em&gt;&lt;/strong&gt;: Analysts unpivot the columns to normalize the data.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;em&gt;Wrong data types&lt;/em&gt;&lt;/strong&gt;: The data types of the columns are adjusted in the Transform tab.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  3. Use DAX for Calculations
&lt;/h2&gt;

&lt;p&gt;To create insightful reports, analysts use DAX functions to perform dynamic calculations on the data. For example, SUM() adds up all the values in a column. Create a new measure:&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Total sales= Sum(Sales[Amount])
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;What if the result you want requires multiple columns, such as quantity multiplied by price? Then you would use SUMX(), which evaluates an expression row by row and sums the results.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Total Revenue = SUMX(Sales,Sales[Quantity]*Sales[Price])
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;What about calculating KPIs? Then I would use CALCULATE(), which allows me to evaluate an expression under complex filters.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Total Sweater sales 2024= CALCULATE(SUM(Sales[Amount]), 
Sales[Product] = "Sweater", YEAR(Sales[Date]) = 2024)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
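&lt;p&gt;Time-based KPIs follow the same pattern. As a sketch (it assumes a Total Sales measure and a model with a proper date table named 'Date'), a year-over-year growth measure combines CALCULATE() with DATEADD() and DIVIDE():&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Sales YoY % =
VAR PriorYear = CALCULATE([Total Sales], DATEADD('Date'[Date], -1, YEAR))
RETURN DIVIDE([Total Sales] - PriorYear, PriorYear)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;DIVIDE() is used instead of the / operator so the measure returns blank rather than an error when there are no prior-year sales.&lt;/p&gt;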



&lt;h2&gt;
  
  
  4. Create Visuals
&lt;/h2&gt;

&lt;p&gt;Analysts then build visualizations from the measures they have created. There are many types of visualizations in Power BI, and they can be arranged into reports or dashboards.&lt;br&gt;
&lt;strong&gt;Reports&lt;/strong&gt; are used for detailed analysis and can have multiple pages and visuals, while &lt;strong&gt;Dashboards&lt;/strong&gt; are mostly used for summaries and therefore contain only the most important insights.&lt;br&gt;
For the measures created above, the most suitable visual would be a &lt;strong&gt;&lt;em&gt;Card&lt;/em&gt;&lt;/strong&gt;, which displays a single important value.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjmz0utgqdugcfec4sy4s.JPG" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjmz0utgqdugcfec4sy4s.JPG" alt=" " width="800" height="425"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Other popular visuals include:
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Bar/Column charts&lt;/strong&gt;: These are used for comparing values in different categories.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Pie/Donut charts&lt;/strong&gt;: Used for showing the percentage of a value to a whole.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Slicers&lt;/strong&gt;: For filtering the visuals on the dashboard.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Map visuals&lt;/strong&gt;: Ideal for geographic analytics either filled or bubble map.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5eic2u358lkeuwvljocc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5eic2u358lkeuwvljocc.png" alt=" " width="800" height="420"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  5. Present Findings
&lt;/h2&gt;

&lt;p&gt;An analyst then publishes their findings for executives. This can be done using the Power BI Service, Microsoft's cloud platform for sharing reports securely (companies that need to host reports on their own servers can use Power BI Report Server instead). There is also a mobile app that allows one to view reports on the go.&lt;/p&gt;

&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;Power BI has proven to be an essential tool for every data specialist. It can assist in converting chaotic data into compelling insights. To tell stories with data, mastering Power BI is non-negotiable for a data analyst.&lt;/p&gt;

</description>
      <category>powerbi</category>
      <category>datascience</category>
      <category>analytics</category>
      <category>dashboards</category>
    </item>
    <item>
      <title>Schemas and data modelling in Power BI</title>
      <dc:creator>Purity Kihoro</dc:creator>
      <pubDate>Mon, 02 Feb 2026 14:00:49 +0000</pubDate>
      <link>https://dev.to/purity_kihoro/schemas-and-data-modelling-in-power-bi-38cj</link>
      <guid>https://dev.to/purity_kihoro/schemas-and-data-modelling-in-power-bi-38cj</guid>
      <description>&lt;h2&gt;
  
  
  &lt;strong&gt;Schema and Data Modelling.&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;A schema is a logical structure that defines how data is organized within a database. Database schemas provide a logical blueprint for data storage and organization, supporting user accessibility, scalability and data integrity. This blueprint includes logical constraints such as table names, fields, data types and the relationships between these entities. The schema does not contain the actual data itself, but rather provides the structure that the data must conform to.&lt;/p&gt;

&lt;p&gt;Schemas commonly use visual representations to communicate the architecture of the database, becoming the foundation for an organization’s data management discipline. The process of designing these schemas is known as &lt;strong&gt;data modelling&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;A data model is a diagram that visually represents a conceptual framework for organizing, defining, and showing the relationships between data elements. This visual method helps clarify complex connections between various data points, simplifying the design of efficient and well-structured databases.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Most common schemas used in Power BI&lt;/strong&gt;&lt;br&gt;
There are two main schemas used in Power BI: the star schema and the snowflake schema.&lt;/p&gt;

&lt;h2&gt;
  
  
  Star Schema
&lt;/h2&gt;

&lt;p&gt;A star schema is a type of schema where a single central fact table is surrounded by multiple dimension tables. The fact table contains the measurable facts of the data model, while the dimension tables contain its descriptive properties. The resulting diagram resembles a star, and the Power BI engine works best with this shape.&lt;br&gt;
Businesses can utilize a star schema to manage and organize large datasets based on two primary principles: facts and dimensions.&lt;br&gt;
&lt;strong&gt;Facts:&lt;/strong&gt; The center of the structure, providing measurement-based pieces of data. Examples are the number of transactions, website clicks, or total purchases.&lt;br&gt;
&lt;strong&gt;Dimensions:&lt;/strong&gt; Provide additional information about the fact, such as which customer made the purchase, where they made it from, and what product they bought.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why is star preferred?&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Easier to understand: Dimensions can be used to slice and dice the facts.&lt;/li&gt;
&lt;li&gt;Better performance: Fewer joins and shorter paths between tables mean queries run faster.&lt;/li&gt;
&lt;li&gt;Scalable: It is easier to add new dimension tables.&lt;/li&gt;
&lt;/ol&gt;
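&lt;p&gt;Slicing a fact by a dimension is straightforward in DAX. As a sketch (the names FactSales, DimCustomer and Region are hypothetical), a measure can filter the fact table through a related dimension:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;West Region Sales = CALCULATE(SUM(FactSales[Amount]), DimCustomer[Region] = "West")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Because the dimension table has a one-to-many relationship with the fact table, the filter on DimCustomer[Region] propagates automatically to FactSales.&lt;/p&gt;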

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7hzynm800rocaiktzchk.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7hzynm800rocaiktzchk.png" alt=" " width="300" height="300"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Snowflake Schema&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;A snowflake schema is a type of schema that extends the star schema by normalizing the dimension tables. The dimension tables are broken down further into sub-dimension tables, creating a more complex structure. For example, a Product dimension may be divided further into Category and Subcategory tables attached to it.&lt;br&gt;
&lt;strong&gt;Advantage&lt;/strong&gt;: Beneficial for reducing data redundancy in complex scenarios.&lt;br&gt;
&lt;strong&gt;Disadvantage&lt;/strong&gt;: Slower performance due to increased table joins.&lt;/p&gt;

</description>
      <category>newbie</category>
      <category>powerbi</category>
      <category>visualization</category>
    </item>
    <item>
      <title>Introduction to Linux for Data Engineers, Beginner Friendly Approach</title>
      <dc:creator>Purity Kihoro</dc:creator>
      <pubDate>Fri, 30 Jan 2026 04:02:56 +0000</pubDate>
      <link>https://dev.to/purity_kihoro/introduction-to-linux-for-data-engineers-beginner-friendly-approach-267a</link>
      <guid>https://dev.to/purity_kihoro/introduction-to-linux-for-data-engineers-beginner-friendly-approach-267a</guid>
      <description>&lt;h2&gt;
  
  
  Why is Linux important for data engineers
&lt;/h2&gt;

&lt;p&gt;Linux is an open-source operating system that is highly customizable, so it can meet the specific needs of different professionals such as data engineers. It is a very efficient and secure platform. Data engineers deal with extracting, transforming and loading very large volumes of data, and they prefer a Linux terminal for the following reasons:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Compatibility with Data Engineering Tools:&lt;/strong&gt; Tools such as Hadoop (storing and processing large datasets), Kafka (real-time data streaming) and Docker (creating, deploying and running containerized applications) all run seamlessly on Linux.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Security and Stability:&lt;/strong&gt; Linux is built with security in mind and is very reliable for handling sensitive data. Its open-source nature allows it to be regularly updated with security patches by developers around the world.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scalability and Flexibility:&lt;/strong&gt; Data engineers work with data that is ever growing in volume; to keep up with demand, Linux is very good at offering the processing power and speed needed to build workflows.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Command Line Interface:&lt;/strong&gt; Data engineers work on the Linux CLI as it ensures efficient, high-speed processing and provides powerful automation capabilities. The CLI is also used to manage remote servers and computers using tools such as SSH.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Basic Linux commands
&lt;/h2&gt;

&lt;p&gt;mkdir : Creates a new directory.&lt;br&gt;
cd : Changes to the specified directory.&lt;br&gt;
ls : Lists files and directories in the current directory.&lt;br&gt;
mv : Moves or renames a file or directory from a source to a destination.&lt;br&gt;
cp : Copies a source file or directory to a destination.&lt;br&gt;
rm : Deletes files and directories.&lt;br&gt;
touch : Creates an empty file or updates its modification time.&lt;br&gt;
clear : Clears the terminal screen.&lt;br&gt;
ssh user@host : Connects to a remote server.&lt;/p&gt;
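&lt;p&gt;A short terminal session ties these commands together (the file and directory names are made up for illustration):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;mkdir project            # create a working directory
cd project               # move into it
touch notes.txt          # create an empty file
cp notes.txt backup.txt  # copy it
mv backup.txt old.txt    # rename the copy
ls                       # list the contents: notes.txt  old.txt
rm old.txt               # delete the copy
clear                    # clean up the screen
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;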

&lt;p&gt;&lt;strong&gt;Text Editors in the Linux Terminal&lt;/strong&gt;&lt;br&gt;
Two common editors are Vi and Nano.&lt;br&gt;
&lt;strong&gt;&lt;em&gt;Vi&lt;/em&gt;&lt;/strong&gt;&lt;br&gt;
Vi is a text editor that divides the editing process into different modes. It has three key modes: Normal mode for navigating and issuing commands, Insert mode for typing text, and Command-line mode for actions such as saving and quitting. This modal approach allows for fast and efficient text manipulation, making it a favorite of many seasoned developers and engineers.&lt;/p&gt;
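&lt;p&gt;A minimal Vi session, as a sketch, shows how the modes fit together:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;vi notes.txt     # open the file; Vi starts in Normal mode
i                # switch to Insert mode and type your text
Esc              # return to Normal mode
:wq              # enter Command-line mode, write the file and quit
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;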

</description>
      <category>linux</category>
      <category>dataengineering</category>
      <category>datascience</category>
    </item>
    <item>
      <title>What concepts should I master to be a Data Engineer?</title>
      <dc:creator>Purity Kihoro</dc:creator>
      <pubDate>Sun, 10 Aug 2025 21:02:08 +0000</pubDate>
      <link>https://dev.to/purity_kihoro/what-concepts-should-i-master-to-be-a-data-engineer-1chp</link>
      <guid>https://dev.to/purity_kihoro/what-concepts-should-i-master-to-be-a-data-engineer-1chp</guid>
      <description>&lt;p&gt;As a new data engineering student, there are a number of concepts that you need to grasp. The concepts will guide you in knowing exactly what to learn in respect to data engineering. So create a notion page and gather all resources available to be able to track your progress while learning.&lt;br&gt;
&lt;strong&gt;i)    Batch Verses Streaming Ingestion.&lt;/strong&gt;&lt;br&gt;
A Data Engineer implements the ETL (Extract Transform and Load) process for their organizations. In extracting the data, they should have identified the sources of these data and have a procedure for collecting the data. Data ingestion is the procedure that the data engineer takes to collect the data from all the different sources and organize it in a way that it can be processed for their specific organization.&lt;br&gt;
Batch Ingestion is when the data is collected over a period of time for example, a minute, a week or a month, once it has all being gathered, it is then processed all together at the same time. It is suitable for when dealing with very large datasets. For example collecting all the sales data of an Ecommerce store after a day.&lt;br&gt;
Stream ingestion is when data is processed as soon as it is collected. In stream ingestion, the data is processed instantly. It is highly recommended for critical data that requires immediate decision making. For example, you can use stream ingestion when dealing with fraud detection systems to identify the fraud as soon as it happens.&lt;br&gt;
&lt;strong&gt;ii)   (CDC) Change Data Capture&lt;/strong&gt;&lt;br&gt;
This is a technique used to ensure that all the records in a database are synchronized across the entire database in real-time. If and when a change is made to a record in a database, then these changes are integrated across the entire database resulting in data with low latency.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;iii)  Idempotency&lt;/strong&gt;&lt;br&gt;
This is the property that repeating a process several times does not change the result. A data engineer performs the ETL process where the data extracted may come from CSV files. Implementing idempotency ensures that no matter how many times the same data is loaded, the output does not change: instead of producing duplicates, the pipeline recognizes that the data has already been loaded, avoiding data inconsistencies.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;iv)   OLAP vs OLTP&lt;/strong&gt;&lt;br&gt;
OLAP (Online Analytical Processing) is a data processing system that performs multidimensional analysis on large amounts of data at high speed. It is best used for analytical reporting such as financial analysis or forecasting future sales.&lt;br&gt;
OLTP (Online Transactional Processing) is the process that powers most online transactions recorded into a database. These transactions are recorded in real time and can be performed by multiple users concurrently; examples include online bank transactions and flight bookings.&lt;br&gt;
Most organizations use OLAP and OLTP together, as both contribute data that is necessary for the growth of the organization.&lt;br&gt;
&lt;strong&gt;v)     Columnar vs Row-based Storage&lt;/strong&gt;&lt;br&gt;
Columnar databases organize data by field, making calculations and aggregations over a column easier. They allow for efficient data retrieval and analysis, as a query only pulls the columns it requires.&lt;br&gt;
Row-based databases organize data by row, making it easier to read and write whole records. They are used mostly for transactional systems that perform frequent CRUD (Create, Read, Update and Delete) operations.&lt;br&gt;
&lt;strong&gt;vi)   Data Partitioning&lt;/strong&gt;&lt;br&gt;
This is the process of subdividing large volumes of data into smaller, more manageable datasets using a suitable criterion, often a column. The smaller datasets created are known as partitions. Partitioning allows for efficient data filtering, so identifying the right column to partition on is key; a column with a manageable number of distinct values, such as a date or region, works well.&lt;br&gt;
&lt;strong&gt;vii)  ETL vs ELT&lt;/strong&gt;&lt;br&gt;
ETL, the core job of a data engineer, stands for Extract, Transform and Load. Extracting is the process of getting the data from different sources. Transforming involves cleaning and enriching the data, for example by standardizing field names and formats. Loading is placing the transformed data into the tools used by the data analysts.&lt;br&gt;
In ETL, the data is transformed (often using SQL) before it is loaded into the DBMS used by the analysts, while in ELT the raw data is loaded into a data warehouse first and then transformed there, usually with SQL.&lt;/p&gt;
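&lt;p&gt;Partitioning by a date column can be sketched in PostgreSQL's declarative partitioning syntax (the table and column names are hypothetical):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;-- parent table partitioned by sale date
CREATE TABLE sales (
    sale_id  TEXT,
    amount   NUMERIC,
    sold_at  DATE
) PARTITION BY RANGE (sold_at);

-- one partition per month; queries that filter on sold_at
-- only scan the partitions they need
CREATE TABLE sales_2024_01 PARTITION OF sales
    FOR VALUES FROM ('2024-01-01') TO ('2024-02-01');
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;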

</description>
    </item>
    <item>
      <title>MASTERING DATA ANALYTICS: The Ultimate Guide To Data Analytics</title>
      <dc:creator>Purity Kihoro</dc:creator>
      <pubDate>Thu, 17 Oct 2024 03:56:48 +0000</pubDate>
      <link>https://dev.to/purity_kihoro/mastering-data-analytics-the-ultimate-guide-to-data-analytics-1hhe</link>
      <guid>https://dev.to/purity_kihoro/mastering-data-analytics-the-ultimate-guide-to-data-analytics-1hhe</guid>
      <description>&lt;p&gt;&lt;strong&gt;Introduction&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Before explaining the techniques and tools involved in data analytics, we first need to understand what data analytics is all about. A data analyst is the person who conducts data analytics for a company.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What is data analytics?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This is the process of collecting data from a specific field or subject, then cleaning and transforming it into visualizations that can be used to come up with solutions to problems within that field.&lt;/p&gt;

&lt;p&gt;The process involves identifying trends and patterns within the data. For example, when analyzing data for an e-commerce site, one can derive the time when most customers are on the site.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Who is a data analyst?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This is the professional who conducts the collection of data, cleans the data and finally transforms the data using a number of techniques and tools. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Tools used in data analytics&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;There are a number of tools that a data analyst has to master. They include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Spreadsheets&lt;/strong&gt; i.e &lt;em&gt;Excel&lt;/em&gt; or &lt;em&gt;Google Sheet&lt;/em&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Visualization tools&lt;/strong&gt; such as &lt;em&gt;Tableau&lt;/em&gt; and &lt;em&gt;Power BI&lt;/em&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Querying tools&lt;/strong&gt; such as &lt;em&gt;Python&lt;/em&gt;, &lt;em&gt;SQL&lt;/em&gt; and &lt;em&gt;R&lt;/em&gt; &lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Big data processing tools&lt;/strong&gt; such as &lt;em&gt;Hive&lt;/em&gt; and &lt;em&gt;Hadoop&lt;/em&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Access and extraction tools&lt;/strong&gt; such as &lt;em&gt;Data Lakes&lt;/em&gt;, &lt;em&gt;Data Pipelines&lt;/em&gt; and &lt;em&gt;Data Warehouses&lt;/em&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;em&gt;It is important to master at least one tool for a specific task instead of learning all the tools that conduct the same task. For example, under the spreadsheets, master only one either Excel or Google Sheets. This is because the principles found under one tool are likely the same for the other tool.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Fundamental skills of a data analyst&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Statistical skills&lt;/strong&gt;. Analysis involves finding relationships within data, so statistics such as the mean, mode and correlation are calculated.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Analytical skills&lt;/strong&gt;. An analyst should have the ability to look at a dataset and figure out possible queries to create that are relevant to it.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Data visualization skills&lt;/strong&gt;. While presenting their findings, an analyst should know the best type of graph to use in order to present the data in the most appealing format.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Problem solving skills&lt;/strong&gt;. After analyzing the data, an analyst should have the ability to figure out which problems can be solved using the findings.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Project management skills&lt;/strong&gt;. An analyst should be organized throughout the whole process, from gathering the data to presenting it. Every step should be well documented to support the accuracy of the findings.&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>analytics</category>
      <category>data</category>
      <category>analyst</category>
      <category>beginners</category>
    </item>
  </channel>
</rss>
