How I Began My Data Science Journey with R in the Last Month
Over the past month, I decided to dive seriously into data science, with one clear mission:
learn how to analyze real data using R like a professional.
To challenge myself, I worked on a complete e-commerce analytics project.
It ended up being demanding, sometimes frustrating, but incredibly rewarding.
Here is what I learned, how I progressed, and why this one-month experience became a turning point in my journey.
1. Getting Started: Learning to Work with R
At first, R looked unusual and a bit intimidating.
But once I started using the right libraries, everything became more natural:
- dplyr for data manipulation
- ggplot2 for visualization
- readxl and read.csv() for importing data
- forecast for my first time-series predictions
Writing pipelines with %>% even became enjoyable.
It felt like guiding the computer step-by-step through a clear thought process.
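To give an idea of what such a pipeline can look like, here is a minimal sketch. The orders table and its column names (category, price, status) are made up for illustration and are not the project's actual data.

```r
library(dplyr)

# Hypothetical order data; column names are assumptions for illustration
orders <- data.frame(
  order_id = 1:6,
  category = c("toys", "books", "toys", "books", "garden", "toys"),
  price    = c(20, 15, 35, 10, 50, 8),
  status   = c("delivered", "delivered", "cancelled",
               "delivered", "delivered", "delivered")
)

# Read the pipeline top to bottom: filter, group, summarise, sort
revenue_by_category <- orders %>%
  filter(status == "delivered") %>%            # keep completed orders only
  group_by(category) %>%                       # one group per category
  summarise(revenue  = sum(price),
            n_orders = n(),
            .groups  = "drop") %>%
  arrange(desc(revenue))                       # best category first

print(revenue_by_category)
```

Each %>% passes the result of the previous step into the next one, which is why the code reads like a list of instructions.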
2. Learning to Structure a Real Data Project
A major lesson from this project: good organization matters.
I created separate scripts for each step of the analysis:
- data import & cleaning
- sales analysis
- product insights
- customer segmentation
- seller performance
- logistics & delivery
- service quality
- predictions
- visualizations
- and a main controller: main.R
This approach taught me how data analysts build reproducible workflows, just like in professional environments.
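A controller script of this kind can be quite small. The sketch below is one possible shape for it, not the exact file from the project: the script names are hypothetical, and run_pipeline() is a helper invented for this example.

```r
# main.R: run each analysis step in a fixed order.
# The file names are hypothetical placeholders, one per step listed above.
steps <- c(
  "01_import_cleaning.R",
  "02_sales_analysis.R",
  "03_product_insights.R",
  "04_customer_segmentation.R",
  "05_seller_performance.R",
  "06_logistics_delivery.R",
  "07_service_quality.R",
  "08_predictions.R",
  "09_visualizations.R"
)

# Source every script in order and stop early if one is missing
run_pipeline <- function(scripts, dir = ".") {
  for (script in scripts) {
    path <- file.path(dir, script)
    if (!file.exists(path)) {
      stop("Missing script: ", path)
    }
    message("Running ", script)
    source(path, local = FALSE)  # evaluate in the global environment
  }
  invisible(scripts)
}

# Only run when all step scripts are actually present
if (all(file.exists(steps))) {
  run_pipeline(steps)
}
```

Because every step is sourced in the same order each time, rerunning main.R from a clean session should reproduce the whole analysis.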
3. The Most Challenging Part: Cleaning and Preparing the Data
I finally understood why people say:
“80% of data science is data cleaning.”
This project involved everything:
- date formats all over the place
- numeric values stored as text with commas
- inconsistent region names
- missing values
- merging multiple data sources
Fixing these issues helped me develop a deeper sense of how real datasets behave — and how to make them usable.
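Here is a small sketch of how these fixes can be written with dplyr and base R. The data, the column names, and the parse_mixed_date() helper are invented for illustration. It also assumes that commas are thousands separators and that only two date formats occur, which may not match a given dataset.

```r
library(dplyr)

# Hypothetical raw data showing the problems listed above
raw <- data.frame(
  order_date = c("2023-01-15", "15/02/2023", "2023-03-01", NA),
  amount     = c("1,250.50", "300", "2,000", "75.25"),
  region     = c(" north", "North", "NORTH ", "south"),
  stringsAsFactors = FALSE
)

# Try each date format in turn and keep the first one that parses
parse_mixed_date <- function(x, formats = c("%Y-%m-%d", "%d/%m/%Y")) {
  out <- rep(as.Date(NA), length(x))
  for (fmt in formats) {
    todo <- is.na(out) & !is.na(x)
    out[todo] <- as.Date(x[todo], format = fmt)
  }
  out
}

# Second data source to merge in (also hypothetical)
region_lookup <- data.frame(
  region = c("north", "south"),
  team   = c("Team A", "Team B"),
  stringsAsFactors = FALSE
)

clean <- raw %>%
  mutate(
    order_date = parse_mixed_date(order_date),
    amount     = as.numeric(gsub(",", "", amount, fixed = TRUE)),  # "1,250.50" -> 1250.5
    region     = tolower(trimws(region))                           # " north" -> "north"
  ) %>%
  filter(!is.na(order_date)) %>%            # drop rows without a usable date
  left_join(region_lookup, by = "region")   # merge the second source

print(clean)
```

Dropping rows with missing dates is only one option. Depending on the analysis, flagging or imputing them may be the better choice.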
4. The Fun Part: Analyzing, Visualizing, and Understanding the Story
Once the data was clean, everything became much more exciting.
I analyzed:
- monthly, quarterly, and yearly revenue
- top-selling products
- customer segmentation (premium, standard, occasional)
- seller performance
- delivery delays
- service quality
- correlation between delivery delay and cancellations
Then came the charts:
line plots, barplots, scatter plots, heatmaps, and more.
This was the moment where the story hidden inside the data finally emerged.
Seasonal patterns showed up, certain categories dominated, and longer delivery delays were clearly associated with more cancellations.
The numbers weren’t just numbers anymore — they were insights.
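As a rough sketch of this kind of analysis, the example below computes monthly revenue, measures the correlation between delay and cancellation, and draws a line plot. The data is simulated purely for illustration, with a delay effect built in on purpose, so the numbers it produces say nothing about the real project.

```r
library(dplyr)
library(ggplot2)

set.seed(42)
n <- 500

# Simulated orders, for illustration only (not the project's data)
orders <- data.frame(
  order_date = as.Date("2023-01-01") + sample(0:364, n, replace = TRUE),
  amount     = round(runif(n, min = 10, max = 200), 2),
  delay_days = rpois(n, lambda = 3)
)
# Simulated assumption: cancellation probability rises with delay
orders$cancelled <- rbinom(n, size = 1,
                           prob = plogis(-3 + 0.4 * orders$delay_days))

# Monthly revenue: map every date to the first day of its month
monthly <- orders %>%
  mutate(month = as.Date(format(order_date, "%Y-%m-01"))) %>%
  group_by(month) %>%
  summarise(revenue = sum(amount), .groups = "drop")

# Correlation between delay (days) and the 0/1 cancellation flag
delay_cancel_cor <- cor(orders$delay_days, orders$cancelled)
print(delay_cancel_cor)

# Line plot of revenue over time
p <- ggplot(monthly, aes(x = month, y = revenue)) +
  geom_line() +
  labs(title = "Monthly revenue", x = "Month", y = "Revenue")
print(p)
```

A correlation like this only shows that the two variables move together. It does not prove that delays cause cancellations.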
5. My First Time-Series Predictions with ARIMA
Exploring time-series forecasting with auto.arima() was one of the most rewarding parts of the project.
I transformed the monthly revenue into a time series and predicted the next quarter.
Seeing R generate future values based on historical data made me feel like I had reached a new level:
“I’m really doing data science now.”
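For readers who want to try the same thing, here is a minimal sketch with the forecast package. The revenue series is simulated and the 2021 start date is an assumption made for the example. Only ts(), auto.arima(), and forecast() are the real building blocks.

```r
library(forecast)

# Simulated monthly revenue over 36 months, for illustration only
set.seed(1)
months  <- 1:36
revenue <- 1000 + 15 * months +              # upward trend
  100 * sin(2 * pi * months / 12) +          # yearly seasonality
  rnorm(36, sd = 30)                         # noise

# Turn the numeric vector into a monthly time series
revenue_ts <- ts(revenue, start = c(2021, 1), frequency = 12)

# Let auto.arima() choose the model orders, then forecast 3 months ahead
fit <- auto.arima(revenue_ts)
fc  <- forecast(fit, h = 3)   # next quarter = 3 monthly values

print(fc)
```

The printed forecast includes prediction intervals alongside the point forecasts, which is a useful reminder of how uncertain the predicted values are.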
Conclusion: A Month That Changed the Way I Learn
This project was much more than a homework assignment.
It was a full immersion into the world of data science with R.
I learned how to:
- clean and structure real-world data
- analyze business performance
- build meaningful visualizations
- create predictive models
- and organize a full analytical workflow
Most importantly, this one-month journey gave me confidence and motivation to continue.
And honestly?
This is just the beginning.