<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Raj Tiwari</title>
    <description>The latest articles on DEV Community by Raj Tiwari (@raj_tiwari_f987064d2f1827).</description>
    <link>https://dev.to/raj_tiwari_f987064d2f1827</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1503803%2F504bc005-961a-437b-a399-0f66b966e0c0.jpg</url>
      <title>DEV Community: Raj Tiwari</title>
      <link>https://dev.to/raj_tiwari_f987064d2f1827</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/raj_tiwari_f987064d2f1827"/>
    <language>en</language>
    <item>
      <title>Transforming Data with Logs: From Chaos to Clarity</title>
      <dc:creator>Raj Tiwari</dc:creator>
      <pubDate>Fri, 25 Jul 2025 08:08:55 +0000</pubDate>
      <link>https://dev.to/raj_tiwari_f987064d2f1827/transforming-data-with-logs-from-chaos-to-clarity-4olg</link>
      <guid>https://dev.to/raj_tiwari_f987064d2f1827/transforming-data-with-logs-from-chaos-to-clarity-4olg</guid>
      <description>&lt;p&gt;&lt;strong&gt;“All models are wrong, but some are useful.”&lt;/strong&gt; &lt;br&gt;
— &lt;em&gt;George E. P. Box&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;In the data world, this quote reminds us that while no transformation or model perfectly captures reality, some techniques make our data more useful. One such technique is log transformation, a simple yet powerful tool for making messy data more model-friendly and interpretable.&lt;/p&gt;
&lt;h2&gt;
  
  
  What is Log Transformation?
&lt;/h2&gt;

&lt;p&gt;Log transformation is a mathematical operation that converts a variable by taking its logarithm — often natural log (ln) or base-10 (log10). It’s primarily used to:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Reduce skewness&lt;/li&gt;
&lt;li&gt;Handle large outliers&lt;/li&gt;
&lt;li&gt;Stabilize variance&lt;/li&gt;
&lt;li&gt;Convert exponential relationships into linear ones&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;If you have a variable x, its log-transformed version is log(x). Be cautious: the logarithm is defined only for strictly positive values, so zeros and negative values cannot be transformed directly.&lt;/p&gt;
&lt;h2&gt;
  
  
  Why Use Log Transformation?
&lt;/h2&gt;
&lt;h2&gt;
  
  
  Normalize Skewed Data
&lt;/h2&gt;

&lt;p&gt;Many real-world variables like income, population, or sales are right-skewed — meaning most values are small, but a few are extremely large. Log transformation helps bring such distributions closer to normal (bell-shaped).&lt;/p&gt;
&lt;h2&gt;
  
  
  Reduce Impact of Outliers
&lt;/h2&gt;

&lt;p&gt;Log transformation compresses large numbers. A jump from 10 to 1000 becomes a jump from 1 to 3 on a log10 scale. This reduces the influence of extreme values on models and graphs.&lt;/p&gt;
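&lt;p&gt;As a quick check of the arithmetic above, here is a minimal sketch using NumPy. The sample values are made up for illustration.&lt;/p&gt;

```python
import numpy as np

# Illustrative values spanning two orders of magnitude (made up for this example)
values = np.array([10, 100, 1000])

# log10 maps every tenfold increase to a step of 1
log_values = np.log10(values)

print(log_values)
```

&lt;p&gt;The raw values span a range of 990, while the log10 values span a range of 2, which is why a single extreme value pulls far less weight after the transform.&lt;/p&gt;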
&lt;h2&gt;
  
  
  Linearize Exponential Relationships
&lt;/h2&gt;

&lt;p&gt;Multiplicative relationships become additive in log space. For example, a power law becomes a straight line in log(x) and log(y) once you take logs of both sides:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;y = a * x^b  →  log(y) = log(a) + b * log(x)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



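&lt;p&gt;An exponential curve, y = a * e^(b*x), works the same way: log(y) = log(a) + b * x. This is especially useful for linear regression, which models the response as an additive, linear function of the predictors.&lt;/p&gt;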
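&lt;p&gt;To make the linearization concrete, the sketch below generates noise-free power-law data and recovers the parameters with a straight-line fit in log space. The values of a and b and the use of np.polyfit are choices made for this illustration, not part of the original example.&lt;/p&gt;

```python
import numpy as np

# Synthetic power-law data: y = a * x^b with a = 2 and b = 1.5 (chosen for illustration)
a_true, b_true = 2.0, 1.5
x = np.array([1.0, 2.0, 4.0, 8.0, 16.0])
y = a_true * x ** b_true

# In log space the relationship is a straight line: log(y) = log(a) + b * log(x)
slope, intercept = np.polyfit(np.log(x), np.log(y), deg=1)

b_est = slope              # the slope recovers the exponent b
a_est = np.exp(intercept)  # exponentiating the intercept recovers a

print(round(b_est, 3), round(a_est, 3))
```

&lt;p&gt;With real, noisy data the estimates would only approximate the true values, and fitting in log space minimizes relative rather than absolute error.&lt;/p&gt;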

&lt;h2&gt;
  
  
  When Should You Use Log Transformation?
&lt;/h2&gt;

&lt;p&gt;Use it when:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Your data is highly right-skewed&lt;/li&gt;
&lt;li&gt;Variance increases with the mean&lt;/li&gt;
&lt;li&gt;You need to meet model assumptions (normality, linearity, etc.)&lt;/li&gt;
&lt;li&gt;You're working with exponential growth data&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Avoid it when:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Your data includes zero or negative values&lt;/li&gt;
&lt;li&gt;You're using models that are largely unaffected by skewed features, such as decision trees, whose splits depend only on the ordering of values&lt;/li&gt;
&lt;li&gt;Interpretation becomes too complex for stakeholders&lt;/li&gt;
&lt;/ol&gt;
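&lt;p&gt;When the only obstacle is zeros, one common workaround (not covered in the list above, so treat it as an aside) is to transform log(1 + x) instead. A minimal sketch with NumPy's log1p and made-up count data follows. It does not help with negative values.&lt;/p&gt;

```python
import numpy as np

# Hypothetical count data that includes a zero (made up for this example)
counts = np.array([0, 1, 9, 99])

# log1p computes log(1 + x), so zero maps to zero instead of negative infinity
log_counts = np.log1p(counts)

# expm1 computes exp(x) - 1 and reverses the transform
restored = np.expm1(log_counts)

print(log_counts)
print(restored)
```

&lt;p&gt;Keep in mind that log(1 + x) is a different transform from log(x), so coefficients from a model fitted on it are not interpreted in quite the same way.&lt;/p&gt;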

&lt;h2&gt;
  
  
  How to Apply Log Transformation in Python (with Pandas &amp;amp; NumPy)
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import numpy as np
import pandas as pd

# Small example with one very large income
df = pd.DataFrame({'income': [30000, 50000, 70000, 200000, 1000000]})

# np.log is the natural log; use np.log10 for base 10
df['log_income'] = np.log(df['income'])
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
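&lt;p&gt;To see what the transform buys you, the sketch below repeats the example above, compares sample skewness before and after, and converts the values back to the original scale. Using pandas' skew() as the measure is a choice made for this illustration.&lt;/p&gt;

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'income': [30000, 50000, 70000, 200000, 1000000]})
df['log_income'] = np.log(df['income'])

# Sample skewness before and after; values closer to 0 indicate a more symmetric distribution
skew_raw = df['income'].skew()
skew_log = df['log_income'].skew()
print(skew_raw, skew_log)

# np.exp reverses the natural log, which is useful when reporting results in original units
recovered = np.exp(df['log_income'])
```

&lt;p&gt;The log column is still right-skewed in this tiny sample, only less so. A log transform reduces skew but does not guarantee a normal distribution.&lt;/p&gt;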



&lt;h2&gt;
  
  
  Real-World Use Cases
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;Finance: Stock returns are often analyzed as log differences of prices (log returns).&lt;/li&gt;
&lt;li&gt;Marketing: Ad spend vs. sales may follow an exponential curve, in which case a log transformation helps linearize the relationship.&lt;/li&gt;
&lt;li&gt;Epidemiology: Disease spread (like COVID-19) is often modeled with log-transformed case counts.&lt;/li&gt;
&lt;li&gt;Machine Learning: Log-transforming heavily skewed features or targets can improve regression accuracy and reduce residual errors.&lt;/li&gt;
&lt;/ol&gt;
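&lt;p&gt;The finance case is easy to sketch. Assuming a pandas Series of closing prices (the numbers below are made up), log returns are the differences of consecutive log prices:&lt;/p&gt;

```python
import numpy as np
import pandas as pd

# Hypothetical closing prices (made up for this example)
prices = pd.Series([100.0, 110.0, 99.0, 99.0])

# Log return for each period: log(p_t) - log(p_{t-1}); the first entry is NaN
log_returns = np.log(prices).diff()

# Log returns add up across periods: their sum equals log(last price / first price)
total = log_returns.sum()

print(log_returns)
print(total)
```

&lt;p&gt;This additivity is one reason log returns are convenient: a multi-period return is a simple sum rather than a product.&lt;/p&gt;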

</description>
      <category>ai</category>
      <category>statistics</category>
      <category>data</category>
      <category>datascience</category>
    </item>
    <item>
      <title>Analyzing Airbnb Listings in Chicago: A Power BI Dashboard Project</title>
      <dc:creator>Raj Tiwari</dc:creator>
      <pubDate>Mon, 07 Oct 2024 08:06:27 +0000</pubDate>
      <link>https://dev.to/raj_tiwari_f987064d2f1827/analyzing-airbnb-listings-in-chicago-a-power-bi-dashboard-project-53mg</link>
      <guid>https://dev.to/raj_tiwari_f987064d2f1827/analyzing-airbnb-listings-in-chicago-a-power-bi-dashboard-project-53mg</guid>
      <description>&lt;h2&gt;
  
  
  “In God we trust, all others bring data.”
&lt;/h2&gt;

&lt;p&gt;- W. Edwards Deming&lt;/p&gt;

&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;As the sharing economy continues to grow, Airbnb has become a major player in the short-term rental market, with thousands of listings worldwide. Analyzing this data can provide insights into trends, customer preferences, pricing strategies, and more. In this article, I’ll walk you through my journey of exploring and visualizing the Airbnb Chicago dataset using Power BI.&lt;/p&gt;

&lt;p&gt;The project was undertaken to deepen my understanding of data visualization while also analyzing real-world data for business insights. The dataset I used contains key details about Airbnb listings in Chicago—including pricing, location, availability, and host information.&lt;/p&gt;

&lt;h2&gt;
  
  
  Dataset Overview
&lt;/h2&gt;

&lt;p&gt;The Airbnb Chicago dataset includes essential features like:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Host Information&lt;/strong&gt;: Name, ID, and whether they are a superhost.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Property Details&lt;/strong&gt;: Type of property (apartment, house, etc.), room type (entire home, private room, etc.), number of guests accommodated.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Pricing&lt;/strong&gt;: Nightly rate, cleaning fees, extra guest charges.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Location&lt;/strong&gt;: Neighborhoods and ZIP codes.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Availability and Reviews&lt;/strong&gt;: Number of available days in a year, customer reviews, and review scores.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Objectives of the Analysis
&lt;/h2&gt;

&lt;p&gt;The main objectives of my analysis were:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Price Distribution&lt;/strong&gt;: Understanding the price distribution across different neighborhoods and property types in Chicago.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Host Insights&lt;/strong&gt;: Identifying how superhost status affects pricing and reviews.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Location Insights&lt;/strong&gt;: Exploring the most popular neighborhoods in terms of availability and reviews.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Seasonality&lt;/strong&gt;: Analyzing how availability varies over time and how it affects pricing.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Steps Involved
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;1. Data Cleaning &amp;amp; Preparation&lt;/strong&gt;&lt;br&gt;
The first step was cleaning the dataset. Some rows had missing or incorrect values that needed to be addressed. I removed irrelevant columns and filled in missing values where applicable, using the pandas and NumPy libraries.&lt;/p&gt;
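&lt;p&gt;The exact cleaning steps are not listed here, so the following is only a sketch of what such a pass might look like in pandas. The rows are made up, the column names follow the chart titles later in this article, and 'license' is a hypothetical stand-in for an irrelevant column.&lt;/p&gt;

```python
import numpy as np
import pandas as pd

# Tiny made-up sample; 'license' is a hypothetical stand-in for an irrelevant column
listings = pd.DataFrame({
    'room_type': ['Entire home/apt', 'Private room', 'Shared room', 'Private room'],
    'neighbourhood': ['Loop', 'Lake View', 'Loop', 'Uptown'],
    'price': [150.0, 80.0, np.nan, 0.0],
    'reviews_per_month': [2.5, np.nan, 0.4, 1.1],
    'license': [None, None, None, None],
})

# Remove a column that is not needed for the analysis
cleaned = listings.drop(columns=['license'])

# A listing with no reviews has no monthly review rate, so treat missing as 0
cleaned['reviews_per_month'] = cleaned['reviews_per_month'].fillna(0)

# Drop rows with a missing price, then keep only strictly positive prices
cleaned = cleaned.dropna(subset=['price'])
cleaned = cleaned[cleaned['price'].gt(0)].reset_index(drop=True)

print(cleaned)
```

&lt;p&gt;Whether to fill, drop, or flag a missing value depends on the column. The choices above are assumptions for illustration, not a record of what was done to the real dataset.&lt;/p&gt;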

&lt;p&gt;&lt;strong&gt;2. Data Modeling in Power BI&lt;/strong&gt;&lt;br&gt;
In Power BI, I imported the cleaned dataset and built the necessary relationships between fields such as property type, price, and availability. Using DAX (Data Analysis Expressions), I calculated average prices, availability percentages, and review counts.&lt;/p&gt;
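&lt;p&gt;The DAX measures themselves are not shown in this article. As a rough pandas analogue (not the DAX that was actually used), the same three aggregates could be sketched as below. The rows are made up, and 'availability_365' and 'number_of_reviews' are assumed column names.&lt;/p&gt;

```python
import pandas as pd

# Made-up sample; 'availability_365' and 'number_of_reviews' are assumed column names
listings = pd.DataFrame({
    'room_type': ['Entire home/apt', 'Entire home/apt', 'Private room', 'Private room'],
    'price': [200.0, 100.0, 60.0, 80.0],
    'availability_365': [365, 73, 146, 219],
    'number_of_reviews': [10, 30, 5, 15],
})

# Share of the year each listing is available, as a percentage
listings['availability_pct'] = listings['availability_365'] / 365 * 100

# One row per room type with the three measures described above
summary = listings.groupby('room_type').agg(
    avg_price=('price', 'mean'),
    avg_availability_pct=('availability_pct', 'mean'),
    review_count=('number_of_reviews', 'sum'),
)

print(summary)
```

&lt;p&gt;Computing the aggregates once in pandas can also serve as a cross-check on the numbers the dashboard reports.&lt;/p&gt;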

&lt;p&gt;&lt;strong&gt;3. Building Visualizations&lt;/strong&gt;&lt;br&gt;
Power BI offers a range of visualizations, and I used several to present my findings clearly:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fws91klufn9qdb1uboza5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fws91klufn9qdb1uboza5.png" alt=" " width="407" height="246"&gt;&lt;/a&gt;&lt;br&gt;
&lt;strong&gt;Chart&lt;/strong&gt;: Sum of reviews_per_month by room_type&lt;br&gt;
&lt;strong&gt;Insight&lt;/strong&gt;: The chart compares review frequency across accommodation types. Entire homes/apartments account for the largest share of reviews.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsabthlywy570diaycj2y.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsabthlywy570diaycj2y.png" alt=" " width="400" height="250"&gt;&lt;/a&gt;&lt;br&gt;
&lt;strong&gt;Chart&lt;/strong&gt;: Count of apartments by room_type, in percent&lt;br&gt;
&lt;strong&gt;Insight&lt;/strong&gt;: The chart shows the proportion of listings in each room type. Entire homes or apartments dominate the market, followed by private rooms, while shared rooms are a very small minority. This distribution likely reflects traveler preferences or the supply of each accommodation type on Airbnb.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fv1jk3pndcxe0b69nb68d.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fv1jk3pndcxe0b69nb68d.png" alt=" " width="352" height="217"&gt;&lt;/a&gt;&lt;br&gt;
&lt;strong&gt;Chart&lt;/strong&gt;: Number of days of availability by room_type&lt;br&gt;
&lt;strong&gt;Insight&lt;/strong&gt;: Hotel rooms are available for the most days on average, while shared rooms are available for the fewest. Entire homes/apartments and private rooms differ only slightly, with entire homes/apartments being a little more available on average.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1q5ahagx7m9fx7gvwsv6.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1q5ahagx7m9fx7gvwsv6.png" alt=" " width="386" height="246"&gt;&lt;/a&gt;&lt;br&gt;
&lt;strong&gt;Chart&lt;/strong&gt;: Top 10 hosts by reviews&lt;br&gt;
&lt;strong&gt;Insight&lt;/strong&gt;: This chart ranks the most active or popular hosts on the platform by the number of reviews they have received.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5o0lajwijb7g9jk4nvkp.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5o0lajwijb7g9jk4nvkp.png" alt=" " width="397" height="270"&gt;&lt;/a&gt;&lt;br&gt;
&lt;strong&gt;Chart&lt;/strong&gt;: Sum of latitude by longitude and neighbourhood&lt;br&gt;
&lt;strong&gt;Insight&lt;/strong&gt;: This map plots data points by latitude and longitude, grouped by neighbourhood, on a view that spans North America. The points concentrate in major urban areas or popular tourist destinations.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm7j669j1n1a1pzanm8o2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm7j669j1n1a1pzanm8o2.png" alt=" " width="358" height="272"&gt;&lt;/a&gt;&lt;br&gt;
&lt;strong&gt;Chart&lt;/strong&gt;: Average of price by neighbourhood&lt;br&gt;
&lt;strong&gt;Insight&lt;/strong&gt;: This chart compares average rental prices across neighbourhoods, which could be useful for travelers or for understanding the local real estate market.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Conclusion&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;This analysis examined short-term rental market dynamics in Chicago. Key findings include:&lt;/p&gt;

&lt;p&gt;Entire homes/apartments dominate both listings (68.74%) and reviews, indicating high popularity.&lt;br&gt;
Hotel rooms have the highest availability (211 days/year), while shared rooms are least available (160 days/year).&lt;br&gt;
Top hosts receive thousands of reviews, with "Zencity" leading at 3,748 reviews.&lt;br&gt;
Pricing varies significantly by neighborhood, with West Englewood unexpectedly showing the highest average price ($537.67).&lt;br&gt;
Geographically, listings are concentrated in major urban areas, particularly on the East and West coasts of North America.&lt;/p&gt;

&lt;p&gt;This data provides insights into rental preferences, pricing strategies, and market distribution, which could be valuable for hosts, travelers, and platform operators in optimizing their approaches to the short-term rental market.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Next Steps&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;In the future, I plan to extend this analysis by incorporating &lt;strong&gt;machine learning models&lt;/strong&gt; to predict pricing and availability trends based on factors such as location, host characteristics, and time of year. This would further enhance the practical value of the insights derived from the dataset.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Visuals and Code&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;The complete Power BI dashboard and associated visuals are available in my GitHub repository: &lt;a href="https://github.com/1111raj/Data_visualisation_powerBI_project" rel="noopener noreferrer"&gt;https://github.com/1111raj/Data_visualisation_powerBI_project&lt;/a&gt;. Feel free to explore the code, data, and insights.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Connect with Me&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;If you'd like to discuss this project or explore collaboration opportunities, feel free to reach out!&lt;/p&gt;

&lt;p&gt;LinkedIn: &lt;a href="https://www.linkedin.com/in/raj-tiwari-113b16b1/" rel="noopener noreferrer"&gt;https://www.linkedin.com/in/raj-tiwari-113b16b1/&lt;/a&gt;&lt;br&gt;
Email: &lt;a href="mailto:rajtiwaridata@gmail.com"&gt;rajtiwaridata@gmail.com&lt;/a&gt;&lt;br&gt;
Mobile: +91-9316432935&lt;/p&gt;

</description>
      <category>datascience</category>
      <category>dataengineering</category>
      <category>data</category>
    </item>
  </channel>
</rss>
