DEV Community

nk_Enuke

Kaggle Diary: MITSUI&CO. Commodity Prediction Challenge: Day 1

Competition URL

https://www.kaggle.com/competitions/mitsui-commodity-prediction-challenge/rules#7-competition-data

Overview

The objective of this competition is to develop accurate and stable prediction models for commodity prices, which seems relatively straightforward. The data is lightweight, and unlike recent competitions that require a high-spec PC just to participate, this looked like one I could actually join, so I decided to give it a try.

Data

The dataset consists of multiple time series from financial markets worldwide, covering financial products such as metals, futures, US stocks, and foreign exchange. The markets include:

  • LME: London Metal Exchange
  • JPX: Japan Exchange Group
  • US: Various US securities exchanges
  • FX: Foreign Exchange

Key features:

  • 1,977 columns - quite a large-scale dataset
  • 4 markets: LME (London metals), JPX (Japan), US (US stocks), FX (foreign exchange)
  • 424 targets: Single commodity returns and differences between commodity pairs
  • 1-4 day lags: Labels are split into test_labels_lag_[1-4].csv
  • The lags need to be taken into account because of financial institution holidays and processing time
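To make the target description concrete, here is a minimal sketch of how a single-instrument return and a pair difference could be computed for a given lag. The instrument names, the prices, and the exact return formula (a log return measured from day t+1 to day t+1+lag) are my assumptions for illustration, not something confirmed by the competition data.

```python
import numpy as np
import pandas as pd

# Synthetic prices for two hypothetical instruments (names and values are made up).
prices = pd.DataFrame(
    {
        "asset_a": [100.0, 102.0, 101.0, 105.0, 107.0],
        "asset_b": [50.0, 50.5, 51.0, 50.0, 52.0],
    }
)


def log_return(series: pd.Series, lag: int) -> pd.Series:
    """Log return from day t+1 to day t+1+lag, aligned to day t (assumed definition)."""
    return np.log(series.shift(-(lag + 1)) / series.shift(-1))


lag = 1

# Target type 1: return of a single instrument.
single_target = log_return(prices["asset_a"], lag)

# Target type 2: difference between the returns of two instruments.
pair_target = log_return(prices["asset_a"], lag) - log_return(prices["asset_b"], lag)

targets = pd.DataFrame({"single": single_target, "pair": pair_target})
print(targets)
```

With this definition the last lag + 1 rows have no label yet, which is one way to see why the labels arrive in separate lagged files.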

Prediction

Apparently, the leaderboard shown during the competition isn't very informative, because the final evaluation is based on how well predictions match the actual values observed after the competition ends.

Code Reading

Reading public code shared by maverick_ss_26.

Examining Code to Verify if the Leaderboard is Really Invalid

https://www.kaggle.com/code/maverickss26/commodity-price-prediction-v1

Preprocessing and Other Tips

  • Check data quality by summing null values: .isnull().sum()
  • Display missing values using histograms
  • Use Prophet time series model
  • Express price volatility using standard deviation
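The non-Prophet tips above can be sketched roughly as follows. This is my own minimal example on synthetic data, not the notebook's code: the column names, the 5-day window, and the use of percentage changes as returns are all assumptions.

```python
import numpy as np
import pandas as pd

# Synthetic price-like data with some missing values (column names are hypothetical).
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(100, 5, size=(30, 3)), columns=["col_a", "col_b", "col_c"])
df.loc[0:4, "col_b"] = np.nan    # 5 missing values (label slicing is inclusive)
df.loc[10:11, "col_c"] = np.nan  # 2 missing values

# Data quality check: number of nulls per column.
null_counts = df.isnull().sum()
print(null_counts)

# Histogram of the per-column null counts: how many columns fall into each bin.
counts, bin_edges = np.histogram(null_counts, bins=3)
print(counts, bin_edges)

# Volatility: rolling standard deviation of daily percentage changes (assumed 5-day window).
returns = df["col_a"].pct_change()
volatility = returns.rolling(window=5).std()
print(volatility.tail())
```

If matplotlib is installed, `null_counts.plot(kind="hist")` should draw the same histogram as a chart instead of printing the bin counts.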

About Prophet Model Options

model = Prophet(
    changepoint_prior_scale=0.05,  # Trend flexibility (smaller = less flexible, more stable trend)
    interval_width=0.95,  # Width of the uncertainty interval (95%) around predictions
)

General Example of Prophet Usage

from prophet import Prophet
import pandas as pd

# Data preparation (required: ds and y columns)
df = pd.DataFrame({
    'ds': ['2024-01-01', '2024-01-02', '2024-01-03'],  # Dates
    'y': [100, 120, 110]  # Values to predict
})

# Create and train model
model = Prophet()
model.fit(df)

# Create future dates
future = model.make_future_dataframe(periods=7)  # 7 days ahead

# Execute prediction
forecast = model.predict(future)
print(forecast[['ds', 'yhat', 'yhat_lower', 'yhat_upper']])

As shown above, Prophet is quite easy to use.
It appears that the test data overlaps with the training data, so a model can effectively see the answers during the competition. That would explain why the leaderboard score is meaningless.
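One way to check this kind of overlap yourself is to merge the test rows against the training rows and count how many match. The sketch below uses tiny synthetic frames; the `date_id` key and the `feature` column are hypothetical names, so swap in whatever columns the real files use.

```python
import pandas as pd

# Tiny synthetic stand-ins for the real train and test files (column names are hypothetical).
train = pd.DataFrame(
    {
        "date_id": [0, 1, 2, 3, 4, 5],
        "feature": [1.0, 1.1, 1.2, 1.3, 1.4, 1.5],
    }
)
test = pd.DataFrame(
    {
        "date_id": [4, 5],
        "feature": [1.4, 1.5],
    }
)

# Left-merge test onto train; indicator=True adds a "_merge" column that is
# "both" when a test row also exists in train and "left_only" otherwise.
merged = test.merge(train, on=["date_id", "feature"], how="left", indicator=True)
overlap_ratio = (merged["_merge"] == "both").mean()

print(f"{overlap_ratio:.0%} of test rows also appear in train")
```

A ratio close to 100% would mean the test set is essentially a slice of the training set, so a score computed on it says little about performance on future data.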
