DEV Community

Elie

Updating a 3-Year-Old Project: Japanese Property Price Forecasting Gets a Modern Pipeline

Three years ago, I started the Japanese Property Price Forecasting project to analyze and predict second-hand apartment prices in Japan. It was a great learning experience, but like many early projects, it lacked a structured pipeline and did not follow current machine learning best practices.

Recently, I revisited the notebook and modernized the workflow. Here’s what I added and improved.


What Changed

1. Modern Preprocessing with Pipelines

Previously, the notebook handled categorical and numeric features manually. Now, it uses a full scikit-learn Pipeline to cleanly separate numeric and categorical preprocessing:

  • Numeric features: mean imputation followed by standard scaling
  • Categorical features: imputation followed by one-hot encoding (categories not seen during training are handled safely)

This makes the workflow cleaner, easier to maintain, and fully reproducible.
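
As a rough illustration of the preprocessing described above, here is a minimal sketch built on scikit-learn's ColumnTransformer. The column names and sample rows are hypothetical, and the "most_frequent" strategy for categorical imputation is an assumption, since the notebook may use a different one.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical column names; the notebook's real columns may differ.
numeric_features = ["Area", "BuildingYear"]
categorical_features = ["Prefecture", "Structure"]

# Numeric branch: fill missing values with the column mean, then standardize.
numeric_pipeline = Pipeline([
    ("imputer", SimpleImputer(strategy="mean")),
    ("scaler", StandardScaler()),
])

# Categorical branch: fill missing values, then one-hot encode.
# "most_frequent" is an assumed strategy, not confirmed by the notebook.
categorical_pipeline = Pipeline([
    ("imputer", SimpleImputer(strategy="most_frequent")),
    # handle_unknown="ignore" encodes unseen categories as all zeros
    # instead of raising an error at prediction time.
    ("onehot", OneHotEncoder(handle_unknown="ignore")),
])

preprocessor = ColumnTransformer([
    ("num", numeric_pipeline, numeric_features),
    ("cat", categorical_pipeline, categorical_features),
])

# Tiny made-up dataset, only to show the transformer running end to end.
train = pd.DataFrame({
    "Area": [55.0, np.nan, 70.0, 82.0],
    "BuildingYear": [1995, 2005, np.nan, 2012],
    "Prefecture": ["Tokyo", "Osaka", np.nan, "Tokyo"],
    "Structure": ["RC", "SRC", "RC", "RC"],
})
test = pd.DataFrame({
    "Area": [60.0],
    "BuildingYear": [2010],
    "Prefecture": ["Kyoto"],  # category not present in the training data
    "Structure": ["RC"],
})

X_train = preprocessor.fit_transform(train)
X_test = preprocessor.transform(test)
```

Because the imputers, scaler, and encoder are fitted only on the training data, the same fitted preprocessor can be reused on new rows without leaking information from them.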

2. LightGBM Regressor with Hyperparameter Tuning

Instead of a simple regression model, the updated notebook trains a LightGBM regressor with hyperparameter tuning, which improves predictive performance while keeping training efficient.

3. Improved Column Handling

The old column names contained spaces and special characters, which could cause issues with pipelines. All column names are now sanitized (underscores replace spaces and symbols), making the notebook pipeline-friendly and more robust for experimentation.
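
One way to implement this kind of sanitization is a small regex helper. The function name and the example column names below are hypothetical, and the exact rule used in the notebook may differ.

```python
import re


def sanitize_column(name: str) -> str:
    """Replace each run of non-word characters with a single underscore."""
    cleaned = re.sub(r"\W+", "_", name.strip())
    # Drop underscores left over at the start or end of the name.
    return cleaned.strip("_")


# Hypothetical examples of messy column names.
columns = ["Trade Price (Yen)", "Nearest Station: Distance", "Area(m^2)"]
sanitized = [sanitize_column(c) for c in columns]
print(sanitized)
# ['Trade_Price_Yen', 'Nearest_Station_Distance', 'Area_m_2']
```

With a pandas DataFrame, the same helper could be applied as df.columns = [sanitize_column(c) for c in df.columns]. Two different original names can collapse to the same sanitized name, so it is worth checking for duplicates afterwards.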

4. Flexible Data Loading

The notebook now loads CSVs dynamically, making it easy to handle multiple files without hardcoding paths.
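
A minimal sketch of dynamic loading, assuming all CSV files sit in one directory and share the same columns. The helper name load_csvs and the directory layout are hypothetical, not taken from the notebook.

```python
from pathlib import Path

import pandas as pd


def load_csvs(data_dir: str, pattern: str = "*.csv") -> pd.DataFrame:
    """Read every file in data_dir matching pattern into one DataFrame."""
    # Sort the paths so the row order is the same on every run.
    files = sorted(Path(data_dir).glob(pattern))
    if not files:
        raise FileNotFoundError(f"No files matching {pattern!r} in {data_dir!r}")
    frames = [pd.read_csv(path) for path in files]
    # ignore_index=True gives the combined frame a fresh 0..n-1 index.
    return pd.concat(frames, ignore_index=True)
```

Calling load_csvs("data") would then pick up any new file dropped into a (hypothetical) data folder without changing the code.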


Why This Matters

Updating old projects is a great way to apply what you’ve learned over time. In this case, the update brought:

  • Cleaner, maintainable code
  • Better reproducibility
  • Stronger model performance
  • Ready for future extensions (feature engineering, explainability, or deployment)

Check It Out

You can explore the updated notebook here:

Japanese_Property_Price_Forecasting_v2.ipynb

If you’re curious about structured ML pipelines or modernizing legacy notebooks, this is a small but practical example of the process.