
Posted on Jun 11 • Originally published at kaggle.com

Top 5 Image Dehazing Datasets Every Computer Vision Researcher Must Know

#computervision #dehaze #imageprocessing #deeplearning

A Complete Guide to Hazy-Clean Paired Datasets, Haze Types, Metrics, Models, and Implementation — With Final Year Project Angles for Researchers, PhD, M.Tech, and Final Year Students

Who is this for? Final year B.Tech/M.Tech students building a dehazing project, PhD researchers benchmarking new architectures, and CV practitioners who need to understand which dataset to trust and why. Every section is written to save you the 40+ hours of scattered paper-reading that most researchers go through before picking a dataset.

Table of Contents

Introduction
What Makes a Great Image Dehazing Dataset?
Dataset 1 — RESIDE
Dataset 2 — O-Haze
Dataset 3 — I-Haze
Dataset 4 — NH-Haze
Dataset 5 — Dense-Haze
Image Dehazing Metrics Explained
Comparison Table — All 5 Datasets Across 12+ Attributes
How to Choose the Right Dataset
Common Dehazing Models Benchmarked
How to Prepare Hazy-Clean Pairs for Training
Research Gap Radar — 5 Open Problems
Implementation Roadmap — 8-Step Week-by-Week Guide
Tools and Frameworks
7 Common Mistakes Researchers Make
Your Next Steps + Conclusion

Introduction

What is Image Dehazing?

Image dehazing is the process of recovering a clear, haze-free image J from a hazy observation I, where atmospheric scattering has degraded contrast, colour fidelity, and visibility. The classical physical model that governs this degradation is the Atmospheric Scattering Model (ASM):

$$I(x) = J(x)\,t(x) + A\,\bigl(1 - t(x)\bigr)$$

where:

I(x) is the observed hazy image at pixel x
J(x) is the scene radiance (the clean image we want to recover)
t(x) is the transmission map: the fraction of light that reaches the camera without being scattered
A is the global atmospheric light (the colour of the haze, typically a bright greyish-white)

The transmission map is related to scene depth d(x) and the atmospheric scattering coefficient β by:

$$t(x) = e^{-\beta\, d(x)}$$

This means that distant objects (large d) have very low transmission and are almost completely obscured by haze, while nearby objects retain most of their original appearance. Dehazing algorithms work by estimating t(x) and A from I(x) alone, then inverting the ASM to recover J(x).
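To make the model concrete, here is a minimal NumPy sketch of the ASM in both directions: synthesising haze from a clean image and a depth map, then inverting it when t(x) and A are known. The function names, the toy depth map, and the `t_min` clamp are illustrative assumptions, not part of any dataset's tooling. Real dehazing is hard precisely because t(x) and A must be estimated rather than given.

```python
import numpy as np

def synthesize_haze(clean, depth, beta=1.0, airlight=0.9):
    """Forward ASM: I = J * t + A * (1 - t), with t = exp(-beta * depth)."""
    t = np.exp(-beta * depth)[..., None]        # (H, W, 1), broadcast over RGB
    hazy = clean * t + airlight * (1.0 - t)
    return hazy, t

def invert_asm(hazy, t, airlight, t_min=0.1):
    """Inverse ASM: J = (I - A) / max(t, t_min) + A."""
    t = np.maximum(t, t_min)                    # floor t to avoid dividing by ~0
    return np.clip((hazy - airlight) / t + airlight, 0.0, 1.0)

rng = np.random.default_rng(0)
clean = rng.random((4, 6, 3))                   # toy clean image in [0, 1)
depth = np.linspace(0.5, 2.0, 24).reshape(4, 6) # toy depth map, arbitrary units

hazy, t = synthesize_haze(clean, depth, beta=1.0, airlight=0.9)
recovered = invert_asm(hazy, t, airlight=0.9)
max_err = float(np.abs(recovered - clean).max())
print(f"max reconstruction error: {max_err:.2e}")
```

With the true transmission and atmospheric light, the inversion recovers the clean image up to floating-point error. The `t_min` floor only matters where haze is dense enough that dividing by t(x) would amplify noise.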

Three Eras of Dehazing Research

Prior-based era (2000s–2014): Methods like Dark Channel Prior (DCP) by He et al. (CVPR 2009, TPAMI 2011) exploited the statistical observation that in most haze-free image patches, at least one colour channel has very low intensity. DCP-based dehazing became the foundational baseline that all subsequent methods compare against.
Deep learning era (2016–2020): CNNs such as DehazeNet, MSCNN, AOD-Net, and GFN learned end-to-end mappings from hazy to clean images, dramatically outperforming prior-based methods on benchmark datasets. These methods were trained and evaluated almost exclusively on synthetic haze.
Transformer and physics-guided era (2020–present): Models like FFA-Net, MAXIM, DehazeFormer, and Dehamer use attention mechanisms, multi-scale feature fusion, and explicit physical priors simultaneously. The critical challenge of this era is the synthetic-to-real gap: models trained on synthetic haze often fail on real outdoor haze despite achieving high PSNR on benchmarks.
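The dark channel prior from the first era is simple enough to sketch in a few lines of NumPy. The version below is a simplified illustration, not He et al.'s full pipeline: it averages the candidate pixels for atmospheric light instead of picking the brightest one, and it omits the transmission refinement step. The function names and the toy image are assumptions made for this example.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def dark_channel(img, patch=15):
    """Minimum over RGB, then a patch x patch minimum filter (patch must be odd)."""
    min_rgb = img.min(axis=2)
    pad = patch // 2
    padded = np.pad(min_rgb, pad, mode='edge')
    windows = sliding_window_view(padded, (patch, patch))
    return windows.min(axis=(2, 3))

def estimate_airlight(img, dark, top_fraction=0.001):
    """Average the pixels whose dark-channel value lies in the top fraction."""
    n = max(1, int(dark.size * top_fraction))
    idx = np.argsort(dark.ravel())[-n:]
    return img.reshape(-1, 3)[idx].mean(axis=0)

def estimate_transmission(img, airlight, omega=0.95, patch=15):
    """t = 1 - omega * dark_channel(I / A); omega < 1 keeps a trace of haze."""
    return 1.0 - omega * dark_channel(img / airlight, patch)

rng = np.random.default_rng(0)
clean = rng.random((32, 32, 3))
clean[..., 0] *= 0.05                 # keep one channel dark, as the prior assumes
hazy = clean * 0.5 + 0.9 * 0.5        # uniform haze with t = 0.5 and A = 0.9

dark_clean = dark_channel(clean, patch=5)
dark_hazy = dark_channel(hazy, patch=5)
print(dark_clean.max(), dark_hazy.min())
```

Haze lifts the dark channel away from zero, which is the signal the prior uses to estimate how much haze is present at each pixel.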

Why Datasets Matter in Dehazing

Dehazing has a unique dataset challenge that other restoration tasks do not share: obtaining ground-truth clean images for real hazy scenes is extremely difficult. You cannot photograph the same scene on a clear day and a hazy day and simply compare them — the lighting, time of day, and seasonal variation will have changed. This fundamental difficulty has driven the community to develop multiple creative dataset collection strategies, each with its own trade-offs.

Understanding which dataset uses which strategy — and what limitations that introduces — is essential for correctly interpreting benchmark results and designing your own research.

How This Article is Structured

Each of the five dataset sections follows the same 13-subsection template: overview, origin, haze characteristics, image statistics, download/access, metadata block, licence, how researchers use it, code to load it, reported state-of-the-art numbers, known limitations, research angles, and a quick-reference summary card. After the datasets, you get metrics, model benchmarks, data preparation recipes, a research gap radar, a week-by-week implementation roadmap, and the tooling ecosystem.

What Makes a Great Image Dehazing Dataset?

1. Haze Type Coverage

Haze in the real world is not a single phenomenon. Homogeneous haze is uniform across the image — a simple fog where density does not vary much spatially. Heterogeneous (non-homogeneous) haze varies spatially — thick patches next to thin patches, common in morning mist and industrial smog. Dense haze almost completely obscures distant scene content with transmission values near zero. A great dataset should clearly document which type it contains, because models trained on homogeneous synthetic haze fail dramatically on non-homogeneous real haze.
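The difference between homogeneous and non-homogeneous haze can be illustrated by letting the scattering coefficient vary across the image. This is only a toy simulation under the ASM, with a hypothetical Gaussian "haze patch" as the density map. It is not how any real non-homogeneous dataset was produced.

```python
import numpy as np

def haze_with_density(clean, depth, beta_map, airlight=0.9):
    """ASM with a per-pixel scattering coefficient beta_map of shape (H, W)."""
    t = np.exp(-beta_map * depth)[..., None]
    return clean * t + airlight * (1.0 - t), t[..., 0]

H, W = 64, 64
rng = np.random.default_rng(0)
clean = rng.random((H, W, 3))
depth = np.ones((H, W))                     # flat scene, so only beta varies

# Homogeneous haze: one density everywhere
beta_uniform = np.full((H, W), 1.0)

# Non-homogeneous haze: thin background plus one dense patch (hypothetical shape)
yy, xx = np.mgrid[0:H, 0:W]
blob = np.exp(-((yy - 20) ** 2 + (xx - 40) ** 2) / (2 * 10.0 ** 2))
beta_patchy = 0.2 + 2.0 * blob

_, t_uniform = haze_with_density(clean, depth, beta_uniform)
_, t_patchy = haze_with_density(clean, depth, beta_patchy)
print(f"transmission std, uniform: {t_uniform.std():.3f}, patchy: {t_patchy.std():.3f}")
```

A model that has only seen the uniform case never learns that transmission can change sharply between neighbouring regions at the same depth, which is one way to think about why it struggles on non-homogeneous haze.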

2. Synthetic vs Real Haze

Synthetic haze is generated by applying the ASM to clean images using depth maps, yielding perfectly aligned hazy-clean pairs with precisely known A and t(x). Synthetic data enables controlled training and full-reference evaluation but misses the spectral, spatial, and dynamic complexity of real outdoor haze.

Real haze is produced optically rather than simulated, either by actual foggy or hazy weather or by a haze machine. Getting clean ground truth requires one of three compromises: using a haze machine in a controlled setting (indoors for I-Haze, outdoors for O-Haze), capturing the scene before and after haze (challenging outdoors), or accepting that no ground truth is available (unpaired real haze datasets like RTTS).

3. Indoor vs Outdoor Scenes

Indoor dehazing (I-Haze) and outdoor dehazing (O-Haze) involve fundamentally different illumination conditions, depth ranges, and haze density profiles. A model trained only on indoor data will fail outdoors because the depth range — and therefore the transmission variation across the image — is completely different.

4. Image Diversity and Scale

A dataset with 10 images is suitable only for evaluation, not training. A dataset with 10,000 image pairs covers a range of textures, depths, and haze densities sufficient for training robust models. Diversity also means coverage across time of day, season, and weather type.

5. Licence and Accessibility

The dehazing community is smaller than the denoising community and some datasets are less formally licensed. Always check whether a dataset allows commercial use, requires citation, or is restricted to academic research before building a product or open-source tool.

Dataset 1 — RESIDE

1.1 Overview

RESIDE (REalistic Single Image DEhazing) is one of the largest and most widely used image dehazing datasets and the de facto standard for dehazing benchmarking. Published by Li et al. in IEEE TIP 2019, RESIDE combines large-scale synthetic indoor and outdoor hazy images with a curated real-world evaluation set. If you have read any deep learning dehazing paper since 2018, you have almost certainly seen PSNR and SSIM numbers on RESIDE's SOTS (Synthetic Objective Testing Set) subset.

1.2 Origin and History

RESIDE was created by Li et al., who recognised that existing dehazing datasets were either too small (a few dozen images) or evaluated only on synthetic data, creating a gap between benchmark performance and real-world results. RESIDE was designed to bridge this gap by providing multiple subsets covering different haze types, densities, and evaluation protocols.

The dataset was introduced alongside the RESIDE benchmark challenge and has been updated in multiple versions. RESIDE-V0 (the original) and RESIDE-6K (a curated 6,000-pair training subset) are the most commonly used variants in recent papers.

1.3 Haze Characteristics

RESIDE uses synthetic haze generated by applying the ASM to depth maps:

Indoor subset (ITS, Indoor Training Set): 13,990 hazy images generated from 1,399 clean indoor images using synthetic depth maps and multiple A and β values per image.
Outdoor subset (OTS, Outdoor Training Set): 313,950 hazy images from clean outdoor images with synthetic haze. The scale of OTS is unique in the dehazing field.
SOTS (Synthetic Objective Testing Set): 500 indoor + 500 outdoor test images with ground truth. This is the standard evaluation split.
HSTS (Hybrid Subjective Testing Set): 10 synthetic + 10 real hazy images for perceptual evaluation without ground truth on the real subset.
RTTS (Real-world Task-driven Testing Set): 4,322 real hazy images without ground truth, for qualitative evaluation only.

Haze parameters: atmospheric light A ∈ [0.7, 1.0], scattering coefficient β ∈ [0.04, 0.20] for outdoor, β ∈ [0.6, 1.8] for indoor.
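The ITS counts imply ten hazy renderings per clean image (13,990 / 1,399 = 10). The sketch below shows how such variants could be produced from the indoor parameter ranges. It assumes uniform sampling and uses hypothetical helper names and a random toy depth map; it is not the authors' generation script.

```python
import numpy as np

def sample_haze_params(rng, n_per_image=10,
                       a_range=(0.7, 1.0), beta_range=(0.6, 1.8)):
    """Draw (A, beta) pairs uniformly from the indoor ranges (an assumption)."""
    a = rng.uniform(a_range[0], a_range[1], size=n_per_image)
    beta = rng.uniform(beta_range[0], beta_range[1], size=n_per_image)
    return list(zip(a, beta))

def render_variants(clean, depth, params):
    """Render one hazy image per (A, beta) pair with the ASM."""
    variants = []
    for a, beta in params:
        t = np.exp(-beta * depth)[..., None]
        variants.append(clean * t + a * (1.0 - t))
    return np.stack(variants)

rng = np.random.default_rng(0)
clean = rng.random((8, 8, 3))       # toy clean image
depth = rng.random((8, 8))          # toy normalised depth in [0, 1)

params = sample_haze_params(rng)
variants = render_variants(clean, depth, params)
n_hazy_total = 1399 * len(params)   # clean images x variants per image
print(variants.shape, n_hazy_total)
```

Because every variant of a scene shares the same clean target, a train/validation split should be made by clean image ID rather than by hazy file, otherwise the same scene leaks into both sides.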

1.4 Image Statistics

Attribute	Value
Indoor training pairs (ITS)	13,990 hazy + 1,399 clean
Outdoor training pairs (OTS)	313,950 hazy + 8,970 clean
SOTS test pairs	500 indoor + 500 outdoor
RTTS real images (no GT)	4,322
Resolution	Varies: 460×620 (indoor), up to 1024×1024 (outdoor)
Haze type	Synthetic homogeneous
Depth maps	Included for ITS
Colour space	RGB

1.5 Download and Access

Official RESIDE page: https://sites.google.com/view/reside-dehaze-datasets/
RESIDE-6K (compact training set): https://github.com/liuye123321/DMT-Net — linked in the repository README
Kaggle mirror: Search "RESIDE dehazing dataset" on Kaggle for community-hosted versions of ITS and SOTS
Direct Google Drive links for each subset are available on the official page

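1.6 Dataset Metadata

Field	Detail
Official Download	https://sites.google.com/view/reside-dehaze-datasets/
Published	2019 (IEEE TIP)
License	Non-commercial research and education use
Authors	Li et al.
Citation	Li et al., "Benchmarking Single-Image Dehazing and Beyond," IEEE TIP 2019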

1.7 Licence

RESIDE is released for non-commercial research and education use only. All publications using RESIDE must cite the original IEEE TIP paper. Commercial use requires written permission from the dataset creators.

1.8 How Researchers Use RESIDE

Standard training protocol: Train on ITS (indoor) or OTS (outdoor) or both. Most recent papers use ITS for indoor evaluation and OTS for outdoor evaluation separately, since models trained on indoor data do not generalise well to outdoor haze due to different depth ranges.

Standard evaluation protocol: Report PSNR and SSIM on SOTS-Indoor (500 pairs) and SOTS-Outdoor (500 pairs) separately. Always specify which SOTS subset you are using — "SOTS" alone is ambiguous.

Qualitative evaluation: Run inference on RTTS images and include visual comparisons in your paper. Since RTTS has no ground truth, only qualitative assessment is possible.

1.9 Code to Load RESIDE

import os
import numpy as np
from PIL import Image
from torch.utils.data import Dataset

class RESIDEDataset(Dataset):
    """
    Dataset loader for RESIDE ITS or SOTS.

    Directory structure expected:
      root/
        hazy/   <- hazy images (named e.g. 1_1.png, 1_2.png for ITS)
        clear/  <- clean images (named e.g. 1.png for ITS)

    For ITS, multiple hazy images correspond to one clean image.
    The naming convention is {clean_id}_{haze_param_id}.png
    """

    def __init__(self, root, mode='its', transform=None):
        self.root = root
        self.transform = transform
        self.hazy_dir = os.path.join(root, 'hazy')
        self.clean_dir = os.path.join(root, 'clear')

        self.hazy_files = sorted(os.listdir(self.hazy_dir))

        if mode == 'its':
            # For ITS: extract clean image ID from hazy filename
            self.clean_map = {
                h: h.split('_')[0] + '.png'
                for h in self.hazy_files
            }
        else:
            # For SOTS: 1-to-1 correspondence
            self.clean_map = {h: h for h in self.hazy_files}

    def __len__(self):
        return len(self.hazy_files)

    def __getitem__(self, idx):
        hazy_name = self.hazy_files[idx]
        clean_name = self.clean_map[hazy_name]

        hazy = np.array(
            Image.open(
                os.path.join(self.hazy_dir, hazy_name)
            )
        ).astype(np.float32) / 255.0

        clean = np.array(
            Image.open(
                os.path.join(self.clean_dir, clean_name)
            )
        ).astype(np.float32) / 255.0

        if self.transform:
            hazy, clean = self.transform(hazy, clean)

        # Convert to (C, H, W) for PyTorch
        hazy = hazy.transpose(2, 0, 1)
        clean = clean.transpose(2, 0, 1)

        return hazy, clean


def compute_psnr_ssim_reside(
    model,
    sots_root,
    split='indoor'
):
    """
    Evaluate a dehazing model on RESIDE SOTS.
    """

    import torch
    from skimage.metrics import peak_signal_noise_ratio as psnr
    from skimage.metrics import structural_similarity as ssim

    dataset = RESIDEDataset(
        os.path.join(sots_root, split),
        mode='sots'
    )

    psnr_list = []
    ssim_list = []

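    model.eval()

    for hazy_chw, clean_chw in dataset:
        # RESIDEDataset yields NumPy arrays in (C, H, W); wrap the input
        # as a tensor and add a batch dimension before calling the model
        hazy_t = torch.from_numpy(
            np.ascontiguousarray(hazy_chw)
        ).unsqueeze(0)

        with torch.no_grad():
            output = model(hazy_t).squeeze(0)

        output_np = (
            output.permute(1, 2, 0)
            .cpu()
            .numpy()
            .clip(0, 1)
        )

        # Back to (H, W, C) for scikit-image
        clean_np = clean_chw.transpose(1, 2, 0)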

        psnr_list.append(
            psnr(
                clean_np,
                output_np,
                data_range=1.0
            )
        )

        # channel_axis replaces the deprecated multichannel flag
        ssim_list.append(
            ssim(
                clean_np,
                output_np,
                data_range=1.0,
                channel_axis=2
            )
        )

    print(
        f"RESIDE SOTS-{split} | "
        f"PSNR: {np.mean(psnr_list):.2f} dB | "
        f"SSIM: {np.mean(ssim_list):.4f}"
    )

    return (
        np.mean(psnr_list),
        np.mean(ssim_list)
    )

1.10 State-of-the-Art Numbers on RESIDE SOTS

SOTS-Indoor:

Model	Year	PSNR (dB)	SSIM
DCP	2011	16.62	0.818
DehazeNet	2016	21.14	0.847
AOD-Net	2017	20.29	0.877
MSBDN	2020	33.67	0.985
FFA-Net	2020	36.39	0.989
MAXIM	2022	38.11	0.991
DehazeFormer-B	2023	40.05	0.994

SOTS-Outdoor:

Model	Year	PSNR (dB)	SSIM
DCP	2011	19.13	0.815
AOD-Net	2017	24.14	0.920
GDN	2019	30.86	0.982
MSBDN	2020	33.48	0.982
FFA-Net	2020	33.57	0.984
DehazeFormer-B	2023	34.81	0.986

1.11 Known Limitations

Synthetic haze only (ITS/OTS/SOTS). The ASM-based synthetic haze is spatially homogeneous and does not capture real-world haze complexity. Models achieving 40+ dB on SOTS-Indoor can look visually poor on real outdoor photos.
Depth map quality for ITS. Indoor depth maps from datasets like NYU Depth V2 used in RESIDE generation have noise and errors that introduce artefacts in synthetic haze.
Scale imbalance. OTS is 300K+ images; most research groups train on ITS only due to compute constraints, meaning OTS is underutilised.
No heterogeneous haze. All RESIDE synthetic haze is generated with a single global A per image — real haze has spatially varying A, which this dataset does not capture.

1.12 Research Angles for Final Year / PhD Students

Synthetic-to-real domain adaptation: Train on RESIDE-ITS and evaluate on O-Haze or NH-Haze. Quantify the domain gap and propose an adaptation method.
Depth-aware dehazing: Use the depth maps included with RESIDE-ITS to design a depth-conditioned architecture that varies its processing by scene distance.
RTTS as an unsupervised training signal: Apply Noise2Noise-style or contrastive learning using RTTS real hazy images without ground truth.
Lightweight dehazing on OTS: Most papers train on ITS; use the full OTS scale to train a lightweight model and show that scale compensates for reduced model capacity.

1.13 Quick Reference Card

RESIDE | 13,990 indoor + 313,950 outdoor training pairs | SOTS: 1,000 test pairs | Synthetic homogeneous haze | Non-commercial research licence | Use as: training + standard benchmark | Primary metrics: PSNR, SSIM on SOTS | Download: sites.google.com/view/reside-dehaze-datasets/

Dataset 2 — O-Haze

2.1 Overview

O-Haze (Outdoor Haze dataset) is the first dataset to provide real outdoor hazy-clean image pairs captured using a professional haze machine. Released by Ancuti et al. at CVPR Workshop 2018, O-Haze addresses the fundamental limitation of purely synthetic benchmarks by providing genuine optical haze — the same atmospheric scattering physics that occurs in real foggy weather, reproduced in a controlled outdoor setting. It is one of the NTIRE 2018 and NTIRE 2019 challenge datasets, giving it significant community visibility.

2.2 Origin and History

O-Haze was created at the Multimedia Lab of Hasselt University, Belgium. The collection methodology was carefully designed: 45 outdoor scenes were photographed both with and without haze produced by a professional haze machine positioned to fill the scene. The haze machine produces water-droplet-based aerosol that mimics the optical properties of natural outdoor haze and fog. This approach yields perfectly aligned hazy-clean pairs — the camera is fixed, only the haze presence changes between the two captures.

O-Haze was one of the first datasets to allow researchers to quantitatively evaluate real-haze dehazing with full-reference metrics, filling a critical gap between synthetic benchmarks and purely qualitative real-world testing.

2.3 Haze Characteristics

O-Haze haze is real, optically generated using a professional haze/fog machine:

Haze type: Dense, relatively homogeneous fog produced by atomised water particles.
Spatial variation: Some spatial variation exists due to wind, outdoor air movement, and distance from the haze machine — making it more realistic than synthetic homogeneous haze.
Depth correlation: Distant objects are more obscured than near objects, consistent with the ASM model.
Colour shift: Real haze introduces a whitish colour cast that varies slightly from image to image depending on ambient lighting and haze density — unlike the constant A used in synthetic datasets.
Haze density: Generally dense — visibility is significantly reduced in most image pairs.

2.4 Image Statistics

Attribute	Value
Total image pairs	45 (hazy + clean)
Training pairs	35
Validation pairs	5
Test pairs (withheld for challenge)	5
Resolution	4964×3312 pixels (original, ~16 MP)
Standard evaluation resolution	Resized to 512×512 or 1024×1024
Haze type	Real (haze machine), outdoor
Scene content	Gardens, streets, paths, vegetation
Colour space	RGB

2.5 Download and Access

Official O-Haze page (NTIRE challenge): https://data.vision.ee.ethz.ch/cvl/ntire18/o-haze/
Direct download: Available from the official page; requires accepting a short terms-of-use agreement.
Papers with Code listing: https://paperswithcode.com/dataset/o-haze
Google Drive mirror: Linked from several GitHub repositories including the NTIRE 2018 challenge GitHub.

2.6 Dataset Metadata

Field	Detail
Official Download	https://data.vision.ee.ethz.ch/cvl/ntire18/o-haze/
Published	April 2018 (CVPR Workshop)
License	Non-commercial research use
Authors	Codruta O. Ancuti, Cosmin Ancuti, Radu Timofte, Christophe De Vleeschouwer
File size	~2.1 GB
Citation	Ancuti et al., "O-HAZE: A Dehazing Benchmark with Real Hazy and Haze-Free Outdoor Images," CVPRW 2018

2.7 Licence

O-Haze is available for non-commercial research and educational use. Citation of the CVPR Workshop 2018 paper is required. The dataset was released as part of the NTIRE 2018 challenge and remains hosted by ETH Zurich's Computer Vision Laboratory.

2.8 How Researchers Use O-Haze

O-Haze is used almost exclusively as a test dataset due to its relatively small size (45 image pairs). The standard evaluation protocol is:

Train on RESIDE-ITS or OTS (synthetic datasets).
Fine-tune or directly evaluate on O-Haze to assess real-haze generalisation.
Report PSNR and SSIM on the 35-pair training set (in some papers) or the 5-pair validation set.
Submit results to the NTIRE challenge evaluation server for official benchmark numbers.

Papers that achieve strong performance on RESIDE-SOTS but perform poorly on O-Haze typically reveal a synthetic-to-real gap. This gap highlights the importance of real-world evaluation, making O-Haze a key benchmark for validating practical dehazing algorithms.

2.9 Code to Load O-Haze

The following example loads O-Haze with plain PIL and NumPy functions and evaluates a dehazing function on the result. Because O-Haze provides aligned hazy and haze-free image pairs, loading follows a straightforward paired-image approach.

import os
import numpy as np
from PIL import Image
from glob import glob
from skimage.metrics import peak_signal_noise_ratio as psnr
from skimage.metrics import structural_similarity as ssim

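def load_ohaze_pairs(ohaze_root, split='train', target_size=(512, 512)):
    """
    Load O-Haze hazy-clean pairs.

    Expected structure:
      ohaze_root/
        hazy/   <- hazy images (e.g. 01_outdoor_hazy.jpg)
        GT/     <- clean ground truth (e.g. 01_outdoor_GT.jpg)

    `split` is kept for interface compatibility but is not used here:
    point `ohaze_root` at the folder of the split you want to load.
    """
    hazy_files = sorted(glob(os.path.join(ohaze_root, 'hazy', '*.jpg')))
    gt_files   = sorted(glob(os.path.join(ohaze_root, 'GT', '*.jpg')))

    # Pairing by sorted order is only valid if both folders are complete
    assert len(hazy_files) == len(gt_files), \
        f"Mismatch: {len(hazy_files)} hazy vs {len(gt_files)} clean"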

    pairs = []
    for hf, gf in zip(hazy_files, gt_files):
        hazy  = Image.open(hf).convert('RGB')
        clean = Image.open(gf).convert('RGB')

        if target_size:
            hazy  = hazy.resize(target_size, Image.LANCZOS)
            clean = clean.resize(target_size, Image.LANCZOS)

        hazy_np  = np.array(hazy).astype(np.float32) / 255.0
        clean_np = np.array(clean).astype(np.float32) / 255.0
        pairs.append((hazy_np, clean_np))

    print(f"Loaded {len(pairs)} O-Haze pairs at {target_size}")
    return pairs

def evaluate_ohaze(model_fn, pairs):
    """Evaluate a dehazing model on O-Haze pairs."""
    psnr_list, ssim_list = [], []
    for hazy, clean in pairs:
        dehazed = model_fn(hazy)
        p = psnr(clean, dehazed, data_range=1.0)
        # channel_axis replaces the deprecated multichannel flag
        s = ssim(clean, dehazed, data_range=1.0, channel_axis=2)
        psnr_list.append(p)
        ssim_list.append(s)
    print(f"O-Haze | PSNR: {np.mean(psnr_list):.2f} dB | "
          f"SSIM: {np.mean(ssim_list):.4f}")
    return np.mean(psnr_list), np.mean(ssim_list)

2.10 State-of-the-Art Numbers on O-Haze

Model	Year	PSNR (dB)	SSIM
DCP	2011	15.78	0.561
MSCNN	2016	17.56	0.810
AOD-Net	2017	15.03	0.527
GFN	2018	21.55	0.844
FPDN	2019	22.57	0.863
FFA-Net	2020	22.12	0.858
AECR-Net	2021	23.43	0.879
DehazeFormer-B	2023	24.16	0.891

Note: O-Haze PSNR values are substantially lower than RESIDE-SOTS values. This is expected because O-Haze contains real atmospheric haze rather than synthetic haze. Direct comparison of absolute PSNR and SSIM values across different datasets is not recommended.

2.11 Known Limitations

Only 45 pairs: The dataset is extremely small and cannot be used for training deep models from scratch. It is primarily intended for evaluation and fine-tuning experiments.
Haze machine ≠ natural haze: The water-droplet aerosol generated by a professional haze machine differs from natural atmospheric haze in particle-size distribution, humidity, and spectral characteristics. Models trained on O-Haze may not fully generalize to real-world weather conditions.
Controlled outdoor setting: All scenes were captured in a similar garden and outdoor environment. Scene diversity is limited compared to large-scale benchmarks.
No depth information: Unlike RESIDE, O-Haze does not provide depth maps, transmission maps, or atmospheric-light annotations.
Resolution mismatch: Original images are approximately 4964×3312 pixels (~16 MP), but most studies evaluate at resized resolutions such as 512×512 or 1024×1024, introducing potential scaling artifacts.

2.12 Research Angles for Final Year / PhD Students

Domain-gap measurement: Train a model on RESIDE-OTS and evaluate it directly on O-Haze. Measure the synthetic-to-real performance gap and identify scene categories (vegetation, sky, roads, shadows) that suffer the most degradation.
Fine-tuning with limited real data: Investigate how fine-tuning on only a small subset of O-Haze pairs affects generalization. Compare different adaptation strategies and determine the minimum number of real-haze samples required.
Perceptual quality optimization: Analyze why PSNR-optimized dehazing methods sometimes produce halo artifacts or unnatural colors on O-Haze. Explore perceptual, adversarial, or human-visual-system-based losses to improve visual quality.
Multi-scale evaluation: Compare model performance at original resolution (4964×3312) and resized resolutions. Quantify how downsampling influences haze-removal quality and fine-detail restoration.
Real-haze robustness benchmarking: Evaluate recent transformer-based and diffusion-based dehazing models on O-Haze to determine whether improvements on synthetic datasets translate to real-world haze.
Cross-dataset generalization: Train on RESIDE, Dense-Haze, or NH-Haze and test on O-Haze. Analyze which datasets contribute most to real-world generalization.
Failure-case analysis: Categorize common failure modes such as color shifts, over-enhancement, sky artifacts, and texture loss. Build a taxonomy of errors to guide future model development.

2.13 Quick Reference Card

O-Haze | 45 real outdoor pairs | Real optical haze (haze machine) | Non-commercial research license | Released: 2018 (CVPR Workshop) | Primary metrics: PSNR, SSIM | Resolution: 4964×3312 (~16 MP) | Download: https://data.vision.ee.ethz.ch/cvl/ntire18/o-haze/

Dataset 3 — I-Haze

3.1 Overview

I-Haze (Indoor Haze dataset) is the indoor counterpart to O-Haze, released by the same Hasselt University team at CVPR Workshop 2018. The dataset contains real hazy-clean image pairs captured in a controlled indoor environment using a professional haze machine. It serves as one of the primary benchmarks for indoor image dehazing and was used in the NTIRE 2018 and NTIRE 2019 challenge tracks.

3.2 Origin and History

I-Haze was motivated by the observation that indoor haze and smoke (originating from cooking fumes, industrial ventilation failures, fires, and cigarette smoke) present challenges distinct from outdoor haze. The dataset was created using the same methodology as O-Haze, where a professional haze machine filled a controlled indoor environment with dense haze while maintaining identical camera positions for the hazy and haze-free captures.

The dataset was released with predefined training and validation splits and supported NTIRE challenge evaluations through a centralized evaluation server, enabling reproducible benchmarking and fair comparison of dehazing methods.

3.3 Haze Characteristics

I-Haze uses real, optically generated indoor haze:

Haze distribution: More spatially uniform than O-Haze because indoor environments have less air movement. This makes I-Haze closer to the assumptions of the Atmospheric Scattering Model (ASM).
Depth range: Indoor scenes typically contain shorter depth ranges (3–8 meters) compared to outdoor scenes. Transmission values are therefore generally higher than in outdoor haze datasets.
Colour temperature: Indoor scenes often contain mixed illumination sources (artificial lighting combined with natural window light), creating additional color-restoration challenges.
Haze density: Dense enough to significantly reduce visibility and contrast while introducing a noticeable whitish haze layer across the scene.
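A quick calculation with t = exp(-β d) shows why shorter indoor depths give higher transmission. The scattering coefficient below (0.1 per metre) and the outdoor distances are hypothetical values chosen only to illustrate the trend. They are not measurements from I-Haze or O-Haze.

```python
import math

BETA = 0.1  # hypothetical scattering coefficient in 1/m, for illustration only

def transmission(depth_m, beta=BETA):
    """Fraction of scene light that survives depth_m metres of haze."""
    return math.exp(-beta * depth_m)

indoor = {d: transmission(d) for d in (3, 8)}      # indoor depth range from the text
outdoor = {d: transmission(d) for d in (20, 50)}   # assumed outdoor distances

for label, values in (("indoor", indoor), ("outdoor", outdoor)):
    for d, t in values.items():
        print(f"{label:8s} d = {d:2d} m  ->  t = {t:.3f}")
```

Under these assumptions the far wall of a room still passes roughly 45% of its light, while an object 50 m away passes under 1%, so the indoor and outdoor problems differ in difficulty even at the same haze density.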

3.4 Image Statistics

Attribute	Value
Total image pairs	35 (hazy + clean)
Training pairs	25
Validation pairs	5
Test pairs (withheld)	5
Resolution	2833×4256 pixels (original, ~12 MP)
Standard evaluation resolution	Resized to 512×512 or 1024×1024
Haze type	Real (haze machine), indoor
Scene content	Living rooms, offices, corridors, furniture, household objects
Colour space	RGB

3.5 Download and Access

Official I-Haze page (NTIRE challenge): https://data.vision.ee.ethz.ch/cvl/ntire18/i-haze/
Direct download: Available from the official page after accepting the dataset terms of use.
Papers with Code listing: https://paperswithcode.com/dataset/i-haze
Combined I-Haze + O-Haze downloads: Community mirrors and GitHub repositories occasionally host both datasets together. Search for "NTIRE 2018 dehazing dataset" to locate available mirrors.

3.6 Dataset Metadata

Field	Detail
Official Download	https://data.vision.ee.ethz.ch/cvl/ntire18/i-haze/
Published	April 2018 (CVPR Workshop)
License	Non-commercial research use
Authors	Codruta O. Ancuti, Cosmin Ancuti, Radu Timofte, Christophe De Vleeschouwer
File Size	~1.4 GB
Citation	Ancuti et al., "I-HAZE: A Dehazing Benchmark with Real Hazy and Haze-Free Indoor Images," CVPRW 2018

3.7 Licence

I-Haze is available for non-commercial research and educational use. Citation of the CVPR Workshop 2018 paper is required when publishing results. The dataset continues to be hosted and maintained through ETH Zurich's Computer Vision Laboratory as part of the NTIRE benchmark collection.

3.8 How Researchers Use I-Haze

I-Haze is primarily used as an indoor real-haze evaluation benchmark. A typical evaluation workflow is:

Train a model on synthetic haze, typically RESIDE-ITS for indoor scenes (RESIDE-OTS is the outdoor counterpart).
Evaluate the trained model directly on I-Haze without additional training.
Measure the synthetic-to-real domain gap using PSNR and SSIM.
Compare results with O-Haze to assess indoor versus outdoor generalization.

Many research papers report results on both I-Haze and O-Haze because the two datasets provide complementary real-world evaluation environments. A model that performs well on RESIDE-ITS but poorly on I-Haze often indicates overfitting to synthetic haze characteristics rather than learning physically meaningful haze removal.

3.9 Code to Load I-Haze

The following loader reads paired hazy and haze-free images from I-Haze into NumPy arrays, and a second helper evaluates a model patch by patch.

import os
import numpy as np
from PIL import Image
from glob import glob

def load_ihaze_pairs(ihaze_root, target_size=(512, 512)):
    """
    Load I-Haze hazy-clean pairs.

    Expected structure:
      ihaze_root/
        hazy/   <- hazy images (e.g. 01_indoor_hazy.jpg)
        GT/     <- clean ground truth (e.g. 01_indoor_GT.jpg)
    """
    hazy_files = sorted(glob(os.path.join(ihaze_root, 'hazy', '*.jpg')))
    gt_files   = sorted(glob(os.path.join(ihaze_root, 'GT', '*.jpg')))

    assert len(hazy_files) == len(gt_files), \
        f"Mismatch: {len(hazy_files)} hazy vs {len(gt_files)} clean"

    pairs = []
    for hf, gf in zip(hazy_files, gt_files):
        hazy  = np.array(
            Image.open(hf).convert('RGB').resize(
                target_size, Image.LANCZOS
            )
        ).astype(np.float32) / 255.0
        clean = np.array(
            Image.open(gf).convert('RGB').resize(
                target_size, Image.LANCZOS
            )
        ).astype(np.float32) / 255.0
        pairs.append((hazy, clean))

    print(f"Loaded {len(pairs)} I-Haze pairs")
    return pairs

def patch_based_dehazing_eval(model_fn, pairs, patch_size=256, stride=256):
    """
    Evaluate model using patch-based inference (for memory-limited GPUs).
    Useful when evaluating at full resolution on I-Haze's large originals.
    """
    from skimage.metrics import peak_signal_noise_ratio as psnr
    from skimage.metrics import structural_similarity as ssim
    import numpy as np

    psnr_list, ssim_list = [], []
    for hazy, clean in pairs:
        H, W, _ = hazy.shape
        output = np.zeros_like(hazy)
        count  = np.zeros((H, W, 1))

        for y in range(0, H - patch_size + 1, stride):
            for x in range(0, W - patch_size + 1, stride):
                patch = hazy[y:y+patch_size, x:x+patch_size]
                denoised_patch = model_fn(patch)
                output[y:y+patch_size, x:x+patch_size] += denoised_patch
                count[y:y+patch_size, x:x+patch_size] += 1

        output = (output / count.clip(min=1)).clip(0, 1)
        psnr_list.append(psnr(clean, output, data_range=1.0))
        # channel_axis replaces the removed multichannel flag (scikit-image >= 0.19)
        ssim_list.append(ssim(clean, output, data_range=1.0,
                               channel_axis=2))

    print(f"I-Haze patch eval | PSNR: {np.mean(psnr_list):.2f} dB | "
          f"SSIM: {np.mean(ssim_list):.4f}")
    return psnr_list, ssim_list
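
One caveat with the sliding window in patch_based_dehazing_eval: it only visits offsets that are multiples of stride, so whenever H or W is not an exact fit for the patch grid (common at full resolution), a strip along the right and bottom borders is never processed and stays at zero in the output, which lowers PSNR and SSIM. A minimal sketch of one way to cover the borders is shown here. The helper name patch_starts is hypothetical and not part of the original code.

```python
def patch_starts(length, patch_size, stride):
    """Return window start offsets that cover [0, length) completely.

    Hypothetical helper: adds one extra window flush with the border
    when the regular stride grid stops short of the image edge.
    """
    if length <= patch_size:
        # Image is smaller than one patch: a single window at 0.
        # The caller then receives a smaller-than-patch_size crop.
        return [0]
    starts = list(range(0, length - patch_size + 1, stride))
    if starts[-1] != length - patch_size:
        starts.append(length - patch_size)
    return starts


# Intended use inside the evaluation loop (replacing the two range(...) calls):
#   for y in patch_starts(H, patch_size, stride):
#       for x in patch_starts(W, patch_size, stride):
#           ...
print(patch_starts(600, 256, 256))
```

Because the last window overlaps its neighbour, the existing count array still averages the overlapping predictions correctly.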

3.10 State-of-the-Art Numbers on I-Haze

Model	Year	PSNR (dB)	SSIM
DCP	2011	14.43	0.754
MSCNN	2016	15.22	0.785
AOD-Net	2017	16.72	0.820
GFN	2018	22.30	0.880
FPDN	2019	22.83	0.888
FFA-Net	2020	23.75	0.912
AECR-Net	2021	24.02	0.915
DehazeFormer-B	2023	25.14	0.927

Note: I-Haze contains real indoor haze rather than synthetic haze. Consequently, PSNR and SSIM values are generally lower than those reported on synthetic benchmarks such as RESIDE-ITS. Cross-dataset comparison of absolute metric values should be avoided.

3.11 Known Limitations

Only 35 image pairs: I-Haze is one of the smallest dehazing datasets available. Performance estimates can vary noticeably because evaluation is based on a very limited number of scenes.
Single indoor environment: All scenes were captured using a similar laboratory setup. Models fine-tuned on I-Haze may overfit to the specific scene characteristics and lighting conditions.
Haze machine uniformity bias: Indoor air movement is limited, producing haze that is more spatially uniform than naturally occurring indoor smoke or aerosol events.
No depth information: The dataset does not include depth maps, transmission maps, atmospheric-light estimates, or haze-density annotations.
Limited scene diversity: Most scenes contain furniture, household objects, office materials, and indoor decorations, reducing environmental variability.

3.12 Research Angles for Final Year / PhD Students

Mixed indoor/outdoor dehazing: Train jointly on RESIDE-ITS, I-Haze, and O-Haze to determine whether combining indoor and outdoor real-haze data improves generalization.
Haze density estimation from image cues: Design a lightweight sub-network that estimates haze density before dehazing and adaptively controls restoration strength.
Colour correction post-dehazing: Investigate colour-cast removal techniques for indoor haze and evaluate perceptual improvements beyond PSNR and SSIM.
Benchmark instability study: Evaluate multiple models on I-Haze's small validation set and quantify confidence intervals for PSNR and SSIM measurements.
Domain adaptation from synthetic to real haze: Compare adversarial, contrastive, and self-supervised adaptation techniques using I-Haze as the target domain.
Indoor scene understanding after dehazing: Measure how object detection, segmentation, and depth-estimation performance changes before and after dehazing.
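
For the benchmark instability angle, a percentile bootstrap over per-image scores is one simple way to attach a confidence interval to a mean PSNR. The sketch below assumes you already have one PSNR value per test image. The function name bootstrap_mean_ci and the example scores are made up for illustration and are not results from any paper.

```python
import numpy as np

def bootstrap_mean_ci(scores, n_boot=10000, alpha=0.05, seed=0):
    """Percentile-bootstrap confidence interval for the mean of per-image scores."""
    scores = np.asarray(scores, dtype=np.float64)
    rng = np.random.default_rng(seed)
    # Resample image indices with replacement, n_boot times.
    idx = rng.integers(0, len(scores), size=(n_boot, len(scores)))
    boot_means = scores[idx].mean(axis=1)
    lower, upper = np.percentile(boot_means,
                                 [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return scores.mean(), lower, upper

# Hypothetical per-image PSNR values for a five-image validation split.
example_psnr = [20.0, 21.0, 22.0, 23.0, 24.0]
mean, lower, upper = bootstrap_mean_ci(example_psnr)
print(f"mean PSNR {mean:.2f} dB, 95% CI [{lower:.2f}, {upper:.2f}] dB")
```

On a handful of images the interval is typically wider than the gaps between neighbouring rows in a results table, which is the instability this research angle sets out to quantify.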

3.13 Quick Reference Card

I-Haze | 35 real indoor pairs | Real optical haze (haze machine) | Non-commercial research licence | Released: 2018 (CVPR Workshop) | Use case: Indoor real-haze evaluation | Primary metrics: PSNR, SSIM | Resolution: 2833×4256 (~12 MP) | Download: https://data.vision.ee.ethz.ch/cvl/ntire18/i-haze/

Dataset 4 — NH-Haze

4.1 Overview

NH-Haze (Non-Homogeneous Haze dataset) is the first dehazing benchmark specifically designed to model spatially varying real-world haze. Released by Ancuti et al. at the CVPR Workshop 2020, NH-Haze addresses a major limitation of earlier datasets such as O-Haze and I-Haze, which primarily contain relatively uniform haze distributions.

The dataset introduces realistic haze-density variations across different regions of the image, creating a significantly more challenging benchmark for image dehazing algorithms. NH-Haze better reflects real atmospheric conditions where haze density changes due to terrain, wind, obstacles, vegetation, and localized aerosol concentrations.

4.2 Origin and History

NH-Haze was developed by the Multimedia Lab at Hasselt University, Belgium, building directly on the experience gained from O-Haze and I-Haze.

The key innovation in NH-Haze is the generation of non-uniform haze distributions. Instead of producing a nearly homogeneous haze layer, the haze generation process deliberately varied haze concentration across different parts of the scene to mimic real atmospheric conditions.

The first version, NH-Haze (2020), was released as part of the NTIRE 2020 Dehazing Challenge. An expanded version, NH-Haze 2, was later introduced for the NTIRE 2021 challenge, providing additional scenes and increasing dataset diversity.

Today, both versions are widely used for evaluating the robustness of modern dehazing networks under realistic non-uniform haze conditions.

4.3 Haze Characteristics

NH-Haze contains real, non-homogeneous optical haze:

Non-uniform density: Haze concentration varies significantly across the image. Some regions may exhibit dense haze while other regions remain comparatively clear.
No global atmospheric-light assumption: Spatial haze variation violates the assumptions made by many classical dehazing methods that rely on a single atmospheric-light estimate.
Scene-depth correlation preserved: Although haze density varies, distant objects generally remain more obscured than nearby objects.
Natural appearance: The resulting haze resembles realistic morning mist, industrial haze, and environmental aerosol distributions more closely than previous benchmarks.
Increased restoration difficulty: Models must adapt to local haze conditions instead of applying uniform enhancement across the entire image.

4.4 Image Statistics

Attribute	Value
NH-Haze v1 pairs	55 (hazy + clean)
Training pairs (v1)	45
Validation pairs (v1)	5
Test pairs (v1, withheld)	5
NH-Haze v2 additional pairs	25
Resolution	~5000×3000 pixels (original)
Standard evaluation resolution	1600×1200 or 512×512
Haze type	Real, non-homogeneous outdoor haze
Scene content	Diverse outdoor environments including forests, streets, buildings, vegetation, and open fields
Colour space	RGB

4.5 Download and Access

NH-Haze v1 (NTIRE 2020): https://data.vision.ee.ethz.ch/cvl/ntire20/nh-haze/
NH-Haze v2 (NTIRE 2021): https://data.vision.ee.ethz.ch/cvl/ntire21/nh-haze2/
Papers with Code listing: https://paperswithcode.com/dataset/nh-haze
GitHub repositories: Several repositories, including NHDehazing and NTIRE challenge implementations, provide download scripts, preprocessing tools, and benchmark code.

4.6 Dataset Metadata

Field	Detail
Official Download	https://data.vision.ee.ethz.ch/cvl/ntire20/nh-haze/
Published	June 2020 (CVPR Workshop)
License	Non-commercial research use
Authors	Codruta O. Ancuti, Cosmin Ancuti, Radu Timofte
File Size	~3.8 GB (v1)
Citation	Ancuti et al., "NH-HAZE: An Image Dehazing Benchmark with Non-Homogeneous Hazy and Haze-Free Images," CVPRW 2020

4.7 Licence

NH-Haze is available for non-commercial research and educational use. Citation of the CVPR Workshop 2020 paper is required when publishing results. The dataset is hosted by ETH Zurich as part of the NTIRE benchmark collection.

4.8 How Researchers Use NH-Haze

NH-Haze serves two major purposes in image dehazing research.

As a Hard Real-Haze Test Set

A model trained on synthetic datasets such as RESIDE and evaluated only on O-Haze or I-Haze may still perform reasonably well because those datasets contain relatively uniform haze distributions. NH-Haze acts as a significantly more challenging benchmark by introducing strong spatial variation in haze density across the image.

Researchers use NH-Haze to evaluate:

Robustness to spatially varying haze.
Generalization beyond homogeneous haze assumptions.
Local contrast restoration capability.
Adaptive atmospheric-light estimation methods.

As a Training Set for Non-Homogeneous Dehazing

With 45 training pairs in Version 1 and additional scenes introduced in Version 2, NH-Haze can also be used for fine-tuning and limited supervised training.

Common training strategies include:

Pre-train on RESIDE-OTS or RESIDE-ITS.
Fine-tune using NH-Haze training pairs.
Evaluate on NH-Haze validation or NTIRE challenge test sets.
Compare performance against O-Haze and I-Haze to measure real-world robustness.

Many recent transformer-based and diffusion-based dehazing models include NH-Haze as part of multi-dataset training protocols to improve real-haze generalization.

4.9 Code to Load NH-Haze

import os
import numpy as np
from PIL import Image
from glob import glob

def load_nhhaze_pairs(nhhaze_root, version=1, target_size=None):
    """
    Load NH-Haze hazy-clean pairs.

    Expected structure:
      nhhaze_root/
        hazy/   <- non-homogeneous hazy images
        GT/     <- clean ground truth
    """
    hazy_files = sorted(glob(os.path.join(nhhaze_root, 'hazy', '*.png')))
    if not hazy_files:
        hazy_files = sorted(glob(os.path.join(nhhaze_root, 'hazy', '*.jpg')))

    gt_files = sorted(glob(os.path.join(nhhaze_root, 'GT', '*.png')))
    if not gt_files:
        gt_files = sorted(glob(os.path.join(nhhaze_root, 'GT', '*.jpg')))

    pairs = []
    for hf, gf in zip(hazy_files, gt_files):
        hazy_img  = Image.open(hf).convert('RGB')
        clean_img = Image.open(gf).convert('RGB')

        if target_size:
            hazy_img  = hazy_img.resize(target_size, Image.LANCZOS)
            clean_img = clean_img.resize(target_size, Image.LANCZOS)

        pairs.append((
            np.array(hazy_img).astype(np.float32) / 255.0,
            np.array(clean_img).astype(np.float32) / 255.0
        ))

    print(f"Loaded {len(pairs)} NH-Haze v{version} pairs")
    return pairs

def visualise_transmission_map(hazy, clean, save_path=None):
    """
    Estimate and visualise approximate transmission map for NH-Haze analysis.
    Uses simplified DCP-based estimation for visualisation only.
    """
    import matplotlib.pyplot as plt

    A = np.percentile(hazy, 99.9, axis=(0,1))
    t_approx = 1.0 - np.min(hazy / A.clip(min=1e-6), axis=2)
    t_approx = t_approx.clip(0, 1)

    fig, axes = plt.subplots(1, 3, figsize=(15, 5))
    axes[0].imshow(hazy); axes[0].set_title('Hazy input')
    axes[1].imshow(clean); axes[1].set_title('Clean GT')
    # t_approx is a transmission estimate: low values mean dense haze
    axes[2].imshow(t_approx, cmap='jet'); axes[2].set_title('Approx. transmission (low = dense haze)')
    for ax in axes: ax.axis('off')

    if save_path:
        plt.savefig(save_path, dpi=150, bbox_inches='tight')
    plt.show()

4.10 State-of-the-Art Numbers on NH-Haze

Model	Year	PSNR (dB)	SSIM
DCP	2011	10.57	0.521
AOD-Net	2017	15.40	0.651
GDN	2019	13.80	0.520
FFA-Net	2020	19.87	0.692
AECR-Net	2021	19.88	0.720
DehazeFormer-B	2023	20.66	0.748
MB-TaylorFormer	2023	21.08	0.762

Note: NH-Haze is significantly more challenging than O-Haze and I-Haze because haze density varies spatially across each image. Consequently, PSNR and SSIM values are noticeably lower even for state-of-the-art methods.

4.11 Known Limitations

Only 55 image pairs (v1): NH-Haze remains a relatively small dataset, resulting in higher statistical variance compared to large-scale synthetic benchmarks.
Controlled non-homogeneity: Although haze density varies spatially, the haze is still generated using a professional haze machine. Real atmospheric haze may exhibit more complex dynamics and environmental interactions.
No depth maps or auxiliary metadata: The dataset does not provide transmission maps, depth maps, atmospheric-light estimates, or haze-density annotations.
Evaluation split inconsistency: Some publications evaluate using the 45-image training split while others report results on the 5-image validation split. Researchers should verify the exact evaluation protocol before comparing results across papers.
Limited environmental diversity: Although more diverse than O-Haze and I-Haze, the dataset still contains a restricted number of outdoor environments compared to real-world deployment scenarios.

4.12 Research Angles for Final Year / PhD Students

Non-homogeneous haze modeling: Extend the Atmospheric Scattering Model (ASM) by estimating a spatially varying atmospheric-light map A(x) rather than assuming a single global value.
Haze-region segmentation: Develop an attention module that explicitly identifies dense-haze and light-haze regions and applies adaptive restoration strategies.
Joint training with RESIDE: Investigate whether combining NH-Haze with RESIDE-OTS improves synthetic-to-real transfer performance and robustness.
Transformer attention analysis: Visualize attention maps in transformer-based dehazing networks to understand how different heads respond to varying haze densities.
Local atmospheric-light estimation: Compare global and patch-wise atmospheric-light estimation methods on NH-Haze.
Multi-scale non-homogeneous restoration: Study whether feature pyramids or hierarchical transformers better handle localized haze distributions.
Cross-dataset robustness evaluation: Train on O-Haze, I-Haze, and RESIDE, then test on NH-Haze to quantify robustness against spatial haze variation.
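
Several of these angles relax the scattering model to I(x) = J(x)·t(x) + A(x)·(1 − t(x)), where the atmospheric light A(x) varies over the image. As a starting point for comparing global and patch-wise estimates, here is a rough sketch. The global estimate reuses the 99.9th-percentile heuristic from visualise_transmission_map, while the grid-based local estimate, its function names, and the default 4×4 grid are illustrative assumptions rather than a published method.

```python
import numpy as np

def global_atmospheric_light(hazy, percentile=99.9):
    """Single RGB atmospheric-light vector for the whole image."""
    return np.percentile(hazy, percentile, axis=(0, 1))

def local_atmospheric_light(hazy, grid=(4, 4), percentile=99.9):
    """Piecewise-constant A(x): one RGB estimate per grid cell.

    Assumes a float image [H, W, 3] with H and W at least as large
    as the grid, so that no cell is empty.
    """
    H, W, _ = hazy.shape
    rows, cols = grid
    ys = np.linspace(0, H, rows + 1).astype(int)
    xs = np.linspace(0, W, cols + 1).astype(int)
    A_map = np.zeros_like(hazy)
    for i in range(rows):
        for j in range(cols):
            block = hazy[ys[i]:ys[i + 1], xs[j]:xs[j + 1]]
            A_map[ys[i]:ys[i + 1], xs[j]:xs[j + 1]] = np.percentile(
                block, percentile, axis=(0, 1))
    return A_map

# Toy image: dim left half, bright right half.
toy = np.full((8, 8, 3), 0.2, dtype=np.float32)
toy[:, 4:] = 0.9
print(global_atmospheric_light(toy))                  # one vector for the image
print(local_atmospheric_light(toy, grid=(1, 2))[0, 0])  # estimate for the left cell
```

The block-wise map has visible seams at cell borders, so in practice it would need smoothing or a learned estimator. The sketch is only meant to make the global-versus-local comparison concrete.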

4.13 Quick Reference Card

NH-Haze | 55 real outdoor pairs (v1) | Real non-homogeneous optical haze | Non-commercial research licence | Released: 2020 (CVPR Workshop) | Use case: Hard real-haze benchmark | Primary metrics: PSNR, SSIM | Resolution: ~5000×3000 | Download: https://data.vision.ee.ethz.ch/cvl/ntire20/nh-haze/

Dataset 5 — Dense-Haze

5.1 Overview

Dense-Haze is a real-image dehazing dataset specifically designed to represent extremely dense atmospheric haze conditions. Released by Ancuti et al. at ICIP 2019, Dense-Haze was created to evaluate dehazing algorithms under scenarios where visibility is severely degraded and many conventional restoration methods fail.

The dataset served as the official benchmark for the NTIRE 2019 Image Dehazing Challenge and is widely regarded as one of the most difficult real-haze datasets available. Dense-Haze represents the upper extreme of haze severity in commonly used dehazing benchmarks.

5.2 Origin and History

Dense-Haze was developed by the Multimedia Lab at Hasselt University, Belgium, following the success of O-Haze and I-Haze. Researchers observed that previous datasets primarily contained moderate haze levels and did not adequately evaluate algorithm performance under severe visibility degradation.

To address this limitation, the haze generation process was modified to produce substantially denser haze concentrations while maintaining paired haze-free reference images. The resulting benchmark provides realistic examples of scenes affected by heavy fog, industrial haze, dense smoke-like conditions, and severe atmospheric scattering.

Dense-Haze became the primary evaluation dataset for the NTIRE 2019 Challenge on Image Dehazing and remains a standard benchmark for assessing algorithm robustness under challenging real-world haze conditions.

5.3 Haze Characteristics

Dense-Haze contains real, extremely dense optical haze:

Very low transmission values: Large portions of the image exhibit extremely low visibility, causing significant information loss.
Severe colour degradation: Dense haze introduces strong colour distortion and contrast reduction, particularly in distant and mid-range regions.
Difficult scene reconstruction: Recovering fine textures and realistic colours is substantially more challenging than in O-Haze, I-Haze, or NH-Haze.
Heavy atmospheric scattering: Multiple scattering effects become more pronounced, violating assumptions used by many classical dehazing methods.
Spatial variation remains: Although haze is generally dense across the image, some local variations remain due to scene geometry and environmental conditions.

5.4 Image Statistics

Attribute	Value
Total image pairs	55 (hazy + clean)
Training pairs	45
Validation pairs	5
Test pairs (withheld)	5
Resolution	~1600×1200 pixels
Haze type	Real, extremely dense outdoor haze
Scene content	Outdoor scenes including vegetation, buildings, roads, and structures
Colour space	RGB

5.5 Download and Access

Official Dense-Haze page (NTIRE 2019): https://data.vision.ee.ethz.ch/cvl/ntire19/dense-haze/
Direct download: Available from the official page after accepting the dataset terms of use.
Papers with Code listing: https://paperswithcode.com/dataset/dense-haze
GitHub repositories: Several open-source dehazing repositories provide preprocessing scripts, evaluation code, and dataset-loading utilities for Dense-Haze.

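# Example download workflow
# The URL below is an HTML landing page, not an archive. Open it in a browser,
# accept the dataset terms, and copy the archive link it provides:
#   https://data.vision.ee.ethz.ch/cvl/ntire19/dense-haze/

# Placeholder: paste the archive link copied from the official page
DENSE_HAZE_ZIP_URL="<archive link from the official page>"

wget -O dense_haze.zip "$DENSE_HAZE_ZIP_URL"

unzip dense_haze.zip -d ./Dense-Haze/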

5.6 Dataset Metadata

Field	Detail
Official Download	https://data.vision.ee.ethz.ch/cvl/ntire19/dense-haze/
Published	September 2019 (ICIP 2019)
License	Non-commercial research use
Authors	Codruta O. Ancuti, Cosmin Ancuti, Radu Timofte, Luc Van Gool
File Size	~600 MB
Citation	Ancuti et al., "Dense-Haze: A Benchmark for Image Dehazing with Dense-Haze and Haze-Free Images," ICIP 2019

5.7 Licence

Dense-Haze is available for non-commercial research and educational use. Citation of the ICIP 2019 paper is required when publishing experimental results. The dataset is hosted by ETH Zurich's Computer Vision Laboratory alongside O-Haze, I-Haze, and NH-Haze.

5.8 How Researchers Use Dense-Haze

Dense-Haze is primarily used as an extreme-conditions real-haze benchmark because it represents the most challenging haze density among standard paired real-haze datasets.

Researchers commonly use Dense-Haze to:

Evaluate robustness under severe visibility degradation.
Measure performance when scene information is heavily obscured.
Compare classical prior-based methods against modern CNN and transformer architectures.
Analyze failure cases involving colour restoration, texture recovery, and structural reconstruction.

Typical observations include:

Published PSNR values on Dense-Haze mostly fall between 14 and 17 dB (see the table in 5.10), so models at the top of that range are generally considered strong performers.
Methods that perform well on O-Haze and I-Haze often experience noticeable degradation on Dense-Haze.
Classical methods such as DCP frequently struggle because dense haze violates many of their underlying assumptions.

A common evaluation protocol is:

Train on RESIDE-OTS or RESIDE-ITS.
Fine-tune using O-Haze, I-Haze, NH-Haze, and Dense-Haze training pairs.
Evaluate separately on each benchmark.
Report cross-dataset robustness and generalization metrics.

5.9 Code to Load Dense-Haze

import os
import numpy as np
from PIL import Image
from glob import glob
from skimage.metrics import peak_signal_noise_ratio as psnr
from skimage.metrics import structural_similarity as ssim

def load_densehaze_pairs(densehaze_root, target_size=(512, 512)):
    """
    Load Dense-Haze hazy-clean pairs.

    Expected structure:
      densehaze_root/
        hazy/   <- densely hazy images
        GT/     <- clean ground truth
    """
    hazy_files = sorted(glob(os.path.join(densehaze_root, 'hazy', '*.png')))
    if not hazy_files:
        hazy_files = sorted(glob(os.path.join(densehaze_root, 'hazy', '*.jpg')))
    gt_files = sorted(glob(os.path.join(densehaze_root, 'GT', '*.png')))
    if not gt_files:
        gt_files = sorted(glob(os.path.join(densehaze_root, 'GT', '*.jpg')))

    pairs = []
    for hf, gf in zip(hazy_files, gt_files):
        hazy  = Image.open(hf).convert('RGB')
        clean = Image.open(gf).convert('RGB')
        if target_size:
            hazy  = hazy.resize(target_size, Image.LANCZOS)
            clean = clean.resize(target_size, Image.LANCZOS)
        pairs.append((
            np.array(hazy).astype(np.float32) / 255.0,
            np.array(clean).astype(np.float32) / 255.0
        ))

    return pairs

def evaluate_densehaze_with_input_baseline(model_fn, pairs):
    """
    Evaluate dehazing model AND compute hazy-input PSNR for Dense-Haze.
    On very dense haze, some models perform WORSE than just passing
    the hazy image through — this comparison is informative.
    """
    model_psnr, input_psnr, ssim_list = [], [], []
    for hazy, clean in pairs:
        dehazed = model_fn(hazy)
        model_psnr.append(psnr(clean, dehazed, data_range=1.0))
        input_psnr.append(psnr(clean, hazy, data_range=1.0))
        # channel_axis replaces the removed multichannel flag (scikit-image >= 0.19)
        ssim_list.append(ssim(clean, dehazed, data_range=1.0,
                               channel_axis=2))

    print(f"Dense-Haze | Model PSNR: {np.mean(model_psnr):.2f} dB | "
          f"Input PSNR: {np.mean(input_psnr):.2f} dB | "
          f"Gain: {np.mean(model_psnr)-np.mean(input_psnr):+.2f} dB | "
          f"SSIM: {np.mean(ssim_list):.4f}")
    return model_psnr, input_psnr, ssim_list

5.10 State-of-the-Art Numbers on Dense-Haze

Model	Year	PSNR (dB)	SSIM
DCP	2011	10.06	0.382
AOD-Net	2017	13.14	0.414
DCPDN	2018	13.66	0.432
EPDN	2019	16.15	0.519
FFA-Net	2020	14.39	0.452
MSBDN	2020	15.37	0.491
AECR-Net	2021	15.80	0.466
DehazeFormer-B	2023	16.62	0.560

Note: Dense-Haze is one of the most challenging real-haze benchmarks available. Rankings often differ substantially from RESIDE, O-Haze, and I-Haze because severe haze causes significant information loss. Always report results across multiple datasets rather than relying on a single benchmark.

5.11 Known Limitations

Ground-truth recoverability ceiling: Under extremely dense haze, portions of scene information may be physically lost due to severe attenuation and scattering. No algorithm can perfectly recover details that were never captured by the camera.
Only 55 image pairs: Like other Hasselt University real-haze datasets, Dense-Haze remains relatively small and therefore exhibits higher statistical variance than large synthetic datasets.
Limited density diversity: Although haze is extremely dense, most images contain similar haze severity levels. The dataset does not provide the broad density variation seen in NH-Haze.
No auxiliary annotations: Depth maps, transmission maps, atmospheric-light estimates, and haze-density labels are not provided.
Evaluation instability: Because benchmark results are often computed on only a few validation images, small PSNR differences may not be statistically significant.
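
To make the evaluation-instability point concrete, a paired sign-flip permutation test on per-image PSNR differences is one way to check whether a gap between two models is distinguishable from noise. The sketch below is an illustration under stated assumptions: the function name is hypothetical, and the scores are invented rather than taken from any publication.

```python
import numpy as np

def paired_signflip_test(scores_a, scores_b, n_perm=10000, seed=0):
    """Two-sided paired permutation test on the mean per-image difference."""
    a = np.asarray(scores_a, dtype=np.float64)
    b = np.asarray(scores_b, dtype=np.float64)
    diffs = a - b
    observed = diffs.mean()
    rng = np.random.default_rng(seed)
    # Under the null hypothesis each difference is equally likely to be + or -.
    signs = rng.choice([-1.0, 1.0], size=(n_perm, len(diffs)))
    perm_means = (signs * diffs).mean(axis=1)
    extreme = np.sum(np.abs(perm_means) >= abs(observed) - 1e-12)
    p_value = (extreme + 1) / (n_perm + 1)
    return observed, p_value

# Invented per-image PSNR for two models on a five-image validation split.
model_a = [16.1, 15.8, 16.4, 15.9, 16.6]
model_b = [15.9, 15.5, 16.0, 15.4, 16.0]
gain, p = paired_signflip_test(model_a, model_b)
print(f"mean gain {gain:+.2f} dB, p = {p:.3f}")
```

With only five images, the smallest two-sided p-value this test can produce is 2/2^5 ≈ 0.06, so even a model that wins on every image cannot reach the usual 0.05 threshold. That is a direct consequence of the tiny validation split.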

5.12 Research Angles for Final Year / PhD Students

Information-theoretic limits of dehazing: Estimate the maximum recoverable information under severe haze conditions and compare theoretical limits against current state-of-the-art performance.
Diffusion models for dense dehazing: Investigate whether diffusion-based restoration methods outperform CNN and transformer architectures under extreme visibility degradation.
Multi-image dense-haze restoration: Explore temporal aggregation, burst photography, or multi-view dehazing approaches to recover information unavailable in a single image.
Dense-haze pre-processing for downstream tasks: Evaluate whether dehazing improves object detection, semantic segmentation, or scene understanding performance under severe haze.
Perceptual quality versus PSNR: Analyze whether models with similar PSNR values produce noticeably different perceptual quality, colour fidelity, and texture restoration.
Adaptive atmospheric-light estimation: Develop local atmospheric-light estimation techniques that remain stable under very dense scattering conditions.
Cross-benchmark robustness study: Train on RESIDE, O-Haze, I-Haze, and NH-Haze, then evaluate on Dense-Haze to identify which training strategies best generalize to extreme haze.

5.13 Quick Reference Card

Dense-Haze | 55 real outdoor pairs | Real extremely dense optical haze | Non-commercial research licence | Released: 2019 (ICIP) | Use case: Extreme-haze benchmark | Primary metrics: PSNR, SSIM | Resolution: ~1600×1200 | Download: https://data.vision.ee.ethz.ch/cvl/ntire19/dense-haze/

Image Dehazing Metrics Explained

PSNR — Peak Signal-to-Noise Ratio

PSNR measures the ratio of maximum signal power to distortion power in decibels. In dehazing, it compares the dehazed image against the clean ground truth:

PSNR = 10 × log₁₀(MAX² / MSE)

Higher PSNR = better dehazing. Typical ranges: 30-40+ dB on RESIDE-SOTS (synthetic), 20-25 dB on O-Haze/I-Haze (real mild), 19-21 dB on NH-Haze (real non-homogeneous), 14-17 dB on Dense-Haze (extreme). Never compare absolute PSNR across datasets — the haze difficulty fundamentally changes the achievable score.
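
The formula translates directly into a few lines of NumPy. This is a minimal sketch with a hypothetical function name (psnr_db). MAX is 1.0 for float images scaled to [0, 1] and 255 for 8-bit images, and mixing the two conventions is a common source of wrong numbers.

```python
import numpy as np

def psnr_db(reference, estimate, max_value=1.0):
    """PSNR in dB between a clean reference and a dehazed estimate."""
    ref = np.asarray(reference, dtype=np.float64)
    est = np.asarray(estimate, dtype=np.float64)
    mse = np.mean((ref - est) ** 2)
    if mse == 0:
        return float('inf')  # identical images
    return 10.0 * np.log10(max_value ** 2 / mse)

# A uniform error of 0.1 on a [0, 1] image gives MSE = 0.01, i.e. about 20 dB.
ref = np.zeros((4, 4, 3))
est = np.full((4, 4, 3), 0.1)
print(f"{psnr_db(ref, est):.2f} dB")
```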

SSIM — Structural Similarity Index

SSIM evaluates luminance, contrast, and structural similarity simultaneously, producing a score that is at most 1 and, for natural image pairs, almost always falls between 0 and 1 (higher = better). In dehazing, SSIM is particularly informative because it detects halo artefacts and contrast errors that PSNR tolerates. A model with high PSNR but low SSIM is often changing brightness globally without restoring structural detail.
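
To show what the three terms look like in practice, here is the SSIM formula evaluated once over a whole greyscale image. This is a simplification: library implementations such as scikit-image's structural_similarity compute the same expression in local sliding windows and average the results, so the numbers will differ. The function name global_ssim is hypothetical. The constants 0.01 and 0.03 are the standard SSIM defaults.

```python
import numpy as np

def global_ssim(x, y, data_range=1.0):
    """Single-window SSIM over two greyscale images (simplified sketch)."""
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    c1 = (0.01 * data_range) ** 2   # stabilises the luminance term
    c2 = (0.03 * data_range) ** 2   # stabilises the contrast/structure term
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = np.mean((x - mu_x) * (y - mu_y))
    return ((2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)) / (
        (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2))

rng = np.random.default_rng(0)
img = rng.random((32, 32))
print(global_ssim(img, img))        # identical images
print(global_ssim(img, 1.0 - img))  # inverted structure gives a negative score
```

The second call illustrates that the score can drop below zero when structure is anti-correlated, although this rarely happens with real dehazing outputs.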

LPIPS — Learned Perceptual Image Patch Similarity

LPIPS uses deep network feature distances to measure perceptual similarity (lower = better). It is increasingly reported in dehazing papers because it picks up the over-saturation and plastic-looking texture artefacts common in GAN-based methods. A dehazing model that scores 1 dB lower PSNR but has significantly better LPIPS is usually producing more visually pleasing results.

CIEDE2000 — Colour Difference Metric

CIEDE2000 measures perceptual colour difference between the dehazed and ground truth images, accounting for human colour perception nonlinearities. Particularly relevant for dehazing because haze introduces a strong colour cast (whitish or yellowish) that PSNR and SSIM may not penalise correctly. Lower CIEDE2000 = better colour fidelity.

FADE — Fog Aware Density Evaluator

FADE is a no-reference metric designed specifically for fog and haze assessment. It estimates haze density from the image without a clean reference, producing a score where lower = clearer image. FADE is useful for evaluating on unpaired real hazy images like RTTS, where no ground truth exists. Its limitations include poor calibration for non-natural images and sensitivity to over-brightening (which artificially reduces estimated haze density).

NIQE — Naturalness Image Quality Evaluator

NIQE measures deviation from natural image statistics without requiring a reference. In dehazing, NIQE helps detect when a model produces an unnaturally sharp or unnaturally smooth result — both common failure modes. Lower NIQE = more natural appearance.

Which Metric for Which Task?

Scenario	Recommended Metrics
Synthetic haze (RESIDE SOTS)	PSNR + SSIM
Real mild haze (O-Haze, I-Haze)	PSNR + SSIM + LPIPS
Non-homogeneous haze (NH-Haze)	PSNR + SSIM + CIEDE2000
Dense haze (Dense-Haze)	PSNR + SSIM + visual comparison
Unpaired real haze (RTTS)	FADE + NIQE (no-reference)
Full research paper	PSNR + SSIM + LPIPS + FADE

Comparison Table — All 5 Datasets Across 12+ Attributes

Attribute	RESIDE	O-Haze	I-Haze	NH-Haze	Dense-Haze
Year	2018/2019	2018	2018	2020	2019
# Pairs	313K+ (OTS) / 14K (ITS)	45	35	55	55
Haze Type	Synthetic	Real (machine)	Real (machine)	Real, non-homogeneous	Real, dense
Setting	Indoor + Outdoor	Outdoor	Indoor	Outdoor	Outdoor
Ground Truth	Synthetic (exact)	Yes (captured)	Yes (captured)	Yes (captured)	Yes (captured)
Haze Density	Configurable	Medium-dense	Medium	Non-uniform	Extreme
Resolution	460×620 to 1024+	4964×3312	2833×4256	~5000×3000	~1600×1200
Depth Maps	Yes (ITS)	No	No	No	No
Use as Train	Yes (primary)	No (too small)	No (too small)	No (borderline)	No
Use as Test	Yes (SOTS)	Yes	Yes	Yes	Yes
Licence	Non-commercial	Non-commercial	Non-commercial	Non-commercial	Non-commercial
Typical PSNR	35–40 dB	22–24 dB	23–25 dB	19–21 dB	14–17 dB
Challenge	NTIRE 2018/2019	NTIRE 2018/2019	NTIRE 2018/2019	NTIRE 2020/2021	NTIRE 2019

How to Choose the Right Dataset

By Haze Type

You are studying haze physics and want controlled conditions: Use RESIDE with synthetic ASM haze. You control A and β precisely and can isolate their effects.

You need to demonstrate real-world applicability at moderate haze: Use O-Haze (outdoor) or I-Haze (indoor) for evaluation. Train on RESIDE, evaluate on both.

You are targeting UAV, road scene, or satellite imagery in real hazy conditions: Use NH-Haze — its non-homogeneous structure best matches real atmospheric conditions over varied terrain.

You are working on extreme conditions (wildfire smoke, thick industrial fog, dense coastal mist): Use Dense-Haze as your primary benchmark.

By Domain

Autonomous driving: RESIDE-OTS (outdoor synthetic) + NH-Haze (non-homogeneous) + RTTS (real road scenes, unpaired). Road scenes need both far-field dehazing (for navigation) and near-field accuracy (for pedestrian detection).

Indoor CCTV/surveillance: RESIDE-ITS + I-Haze. Indoor depth ranges and artificial lighting conditions are specific to this domain.

Medical imaging / endoscopy: None of the five datasets are directly applicable. Turbid medium scattering in tissue is governed by different physics. Use these datasets for pre-training only.

Satellite/aerial remote sensing: RESIDE-OTS provides the largest training set. NH-Haze best approximates heterogeneous cloud/haze cover patterns.

By Compute Budget

GPU-constrained (<8 GB VRAM): Train on RESIDE-ITS only (14K pairs, easily fits). Evaluate on SOTS-Indoor + I-Haze + O-Haze.

Standard research (24 GB VRAM): Train on RESIDE-ITS + OTS subset (100K pairs). Evaluate on full five-dataset suite.

Large-scale (multi-GPU): Use full RESIDE-OTS (313K pairs) + data augmentation + multi-scale training. Evaluate on all five datasets plus RTTS qualitative.

Common Dehazing Models Benchmarked

DCP — Dark Channel Prior (He et al., CVPR 2009 / TPAMI 2011): The foundational prior-based method. Exploits the observation that at least one colour channel has near-zero intensity in most haze-free patches. Computationally efficient and requires no training data. Still the mandatory baseline. Fails on sky regions and white objects where the dark channel prior breaks.

DehazeNet (Cai et al., IEEE TIP 2016): The first CNN-based dehazing method. Learns the transmission map from hazy image patches using a shallow convolutional network. Pioneered the end-to-end learning paradigm for dehazing.

AOD-Net (Li et al., ICCV 2017): Reformulates the ASM to directly estimate a unified parameter K that captures both A and t in a single network pass. Extremely lightweight and fast — good for embedded deployment despite lower PSNR than modern methods.

GFN — Gated Fusion Network (Ren et al., CVPR 2018): Multi-scale feature fusion with learned gating. Strong performance on O-Haze and I-Haze at the time of publication.

FFA-Net (Qin et al., AAAI 2020): Feature Fusion Attention Network — channel and pixel attention mechanisms allow the network to adaptively weight features by their importance for haze removal. Significant jump over prior methods on RESIDE-SOTS.

MSBDN (Dong et al., CVPR 2020): Multi-Scale Boosted Dehazing Network with dense feature fusion. Strong on RESIDE outdoor scenes.

AECR-Net (Wu et al., CVPR 2021): An autoencoder-like network trained with Contrastive Regularisation. The contrastive loss pulls the restored image towards the clean image and pushes it away from the hazy input in feature space, improving perceptual quality alongside PSNR.

DehazeFormer (Song et al., IEEE TIP 2023): Transformer-based architecture that adapts the Swin Transformer to dehazing with a rescaled layer normalisation (RescaleNorm), a SoftReLU activation, and a modified window attention mechanism. At publication it reported 40+ dB on RESIDE-SOTS-Indoor and strong results across the real-haze benchmarks.
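
Since DCP is the baseline every paper compares against, it helps to see how small its core is. The following is a minimal NumPy sketch of the dark channel and the resulting transmission estimate. It assumes the atmospheric light A is already known and omits the soft-matting or guided-filter refinement of the full method. The 15×15 patch and ω = 0.95 follow the values commonly quoted from He et al., and the function names are illustrative.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def dark_channel(img, patch=15):
    """Per-pixel minimum over RGB, then minimum over a patch x patch window."""
    min_rgb = img.min(axis=2)
    pad = patch // 2                      # assumes an odd patch size
    padded = np.pad(min_rgb, pad, mode='edge')
    windows = sliding_window_view(padded, (patch, patch))
    return windows.min(axis=(2, 3))

def estimate_transmission(hazy, A, omega=0.95, patch=15):
    """DCP transmission estimate: t = 1 - omega * dark_channel(I / A)."""
    normalised = hazy / np.maximum(A, 1e-6)
    return 1.0 - omega * dark_channel(normalised, patch)

# Toy check: an image that is pure atmospheric light is treated as fully hazy.
A = np.array([0.8, 0.85, 0.9])
pure_haze = np.ones((20, 20, 3)) * A
t = estimate_transmission(pure_haze, A)
print(t.shape, float(t.min()), float(t.max()))
```

The toy case also shows the known failure mode: a bright sky or white wall looks identical to pure haze, so its transmission is underestimated.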

How to Prepare Hazy-Clean Pairs for Training

Synthetic Haze Generation

If you want to create your own training data or augment existing datasets with synthetic haze:

import numpy as np
from PIL import Image
import cv2

def synthesise_haze_asm(clean_img, depth_map, beta_range=(0.05, 0.20), 
                         A_range=(0.7, 1.0)):
    """
    Generate a hazy image using the Atmospheric Scattering Model.

    Args:
        clean_img: float32 numpy array [H, W, 3] in [0, 1]
        depth_map: float32 numpy array [H, W] — scene depth, normalised to [0, 1]
        beta_range: scattering coefficient range
        A_range: atmospheric light range (per channel)

    Returns:
        hazy: synthesised hazy image
        t_map: transmission map
        A: atmospheric light vector
    """
    beta = np.random.uniform(*beta_range)
    A = np.random.uniform(A_range[0], A_range[1], size=(1, 1, 3)).astype(np.float32)

    # Scale normalised depth to a plausible metric range (here 0–10 m)
    depth = depth_map * 10.0

    # Compute transmission map: t = exp(-beta * d)
    t_map = np.exp(-beta * depth).astype(np.float32)
    # Floor t so the scene is never fully obscured
    # (with the default ranges, t >= exp(-2) ≈ 0.135, so the floor rarely triggers)
    t_map = np.clip(t_map, 0.1, 1.0)
    t_map = t_map[:, :, np.newaxis]   # shape [H, W, 1]

    # Apply ASM: I = J*t + A*(1-t)
    hazy = clean_img * t_map + A * (1.0 - t_map)
    hazy = np.clip(hazy, 0, 1).astype(np.float32)

    return hazy, t_map.squeeze(), A.squeeze()

def generate_hazy_dataset(clean_images, depth_maps, num_per_image=3):
    """
    Generate multiple hazy versions of each clean image using different A, β.
    Mimics RESIDE ITS generation where 10 hazy variants exist per clean image.
    """
    pairs = []
    for clean, depth in zip(clean_images, depth_maps):
        for _ in range(num_per_image):
            hazy, t_map, A = synthesise_haze_asm(clean, depth)
            pairs.append({
                'hazy': hazy, 
                'clean': clean, 
                't_map': t_map, 
                'A': A
            })
    return pairs
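A quick way to sanity-check a synthesis pipeline like the one above is to invert the ASM with the known transmission map and atmospheric light and confirm that the clean image comes back. The sketch below is self-contained and uses toy random arrays; `invert_asm` and the chosen β and A values are illustrative assumptions, not part of any dataset toolkit.

```python
import numpy as np

def invert_asm(hazy, t_map, A, t_min=0.1):
    """Recover J from I = J*t + A*(1 - t) when t and A are known."""
    t = np.clip(t_map, t_min, 1.0)[..., np.newaxis]   # [H, W, 1]
    clean = (hazy - A * (1.0 - t)) / t
    return np.clip(clean, 0.0, 1.0)

rng = np.random.default_rng(42)
clean = rng.uniform(0.0, 1.0, size=(8, 8, 3)).astype(np.float32)
depth = rng.uniform(0.0, 1.0, size=(8, 8)).astype(np.float32)

beta = 0.15                                            # assumed scattering coefficient
A = np.array([0.90, 0.90, 0.95], dtype=np.float32)     # assumed atmospheric light
t_map = np.exp(-beta * depth * 10.0)                   # same depth scaling as above
hazy = clean * t_map[..., None] + A * (1.0 - t_map[..., None])

recovered = invert_asm(hazy, t_map, A)
round_trip_error = float(np.abs(recovered - clean).max())
print(f"round-trip error: {round_trip_error:.2e}")
```

The round trip is only exact while no hazy pixel was clipped during synthesis; a large error here usually points to a shape or broadcasting mistake in the generator.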

Real Haze Dataset Preprocessing

For O-Haze, I-Haze, NH-Haze, and Dense-Haze:

import numpy as np

def preprocess_real_haze_pairs(pairs, patch_size=256, stride=128,
                                 min_haze_score=0.01):
    """
    Extract aligned patches from real haze pairs with quality filtering.

    Args:
        pairs: iterable of (hazy, clean) float arrays [H, W, 3] in [0, 1]
        patch_size, stride: sliding-window patch size and step in pixels
        min_haze_score: minimum mean-brightness increase (hazy minus clean)
            for a patch to count as hazy

    Discards near-uniform patches (sky, walls) and patches with almost no haze.
    """
    filtered_patches = []

    for hazy, clean in pairs:
        H, W, _ = hazy.shape

        for y in range(0, H - patch_size + 1, stride):
            for x in range(0, W - patch_size + 1, stride):
                hazy_p  = hazy[y:y+patch_size, x:x+patch_size]
                clean_p = clean[y:y+patch_size, x:x+patch_size]

                # Quality filter: discard near-uniform patches
                clean_std = np.std(clean_p)
                if clean_std < 0.03:  # uniform patch (sky/wall)
                    continue

                # Haze filter: discard patches where haze is minimal
                haze_diff = np.mean(hazy_p) - np.mean(clean_p)
                if haze_diff < min_haze_score:
                    continue

                filtered_patches.append((hazy_p, clean_p))

    print(f"Extracted {len(filtered_patches)} quality patches")
    return filtered_patches
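The fixed-stride extraction above produces a static patch set. Many training pipelines instead sample one random crop per image per iteration, which needs much less memory on high-resolution pairs. A minimal sketch of that alternative follows; the function name and the toy arrays are assumptions for illustration.

```python
import numpy as np

def paired_random_crop(hazy, clean, patch_size=256, rng=None):
    """Cut the same random window out of an aligned hazy-clean pair."""
    if hazy.shape != clean.shape:
        raise ValueError("hazy and clean must have identical shapes")
    if rng is None:
        rng = np.random.default_rng()
    H, W = hazy.shape[:2]
    if H < patch_size or W < patch_size:
        raise ValueError("image is smaller than patch_size")
    y = int(rng.integers(0, H - patch_size + 1))
    x = int(rng.integers(0, W - patch_size + 1))
    window = (slice(y, y + patch_size), slice(x, x + patch_size))
    return hazy[window], clean[window]

# Toy pair: the hazy image is the clean image plus a constant offset,
# so any misalignment between the two crops would show up immediately.
rng = np.random.default_rng(0)
clean_img = rng.uniform(0.0, 0.8, size=(100, 120, 3))
hazy_img = clean_img + 0.1
hazy_p, clean_p = paired_random_crop(hazy_img, clean_img, patch_size=64, rng=rng)
print(hazy_p.shape, float(np.abs(hazy_p - clean_p - 0.1).max()))
```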

Augmentation for Dehazing

import random

import numpy as np

def augment_hazy_pair(hazy, clean):
    """
    Apply identical augmentation to hazy-clean pair.
    Dehazing-specific: avoid colour jitter on clean (distorts ground truth).
    """
    # Geometric augmentation (applied identically)
    if random.random() > 0.5:
        hazy  = hazy[:, ::-1, :].copy()
        clean = clean[:, ::-1, :].copy()
    if random.random() > 0.5:
        hazy  = hazy[::-1, :, :].copy()
        clean = clean[::-1, :, :].copy()
    k = random.randint(0, 3)
    hazy  = np.rot90(hazy, k).copy()
    clean = np.rot90(clean, k).copy()

    # Haze intensity augmentation (only applied to hazy, not clean)
    # Randomly scale the haze density to augment training distribution
    if random.random() > 0.7:
        alpha = random.uniform(0.8, 1.2)  # scale haze intensity
        # Move toward or away from the clean image
        hazy_aug = clean + alpha * (hazy - clean)
        hazy = np.clip(hazy_aug, 0, 1).astype(np.float32)

    return hazy, clean
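Under the ASM, the blend used in the haze-intensity step has a simple interpretation, which the short derivation below spells out. It assumes the pair obeys the ASM exactly with a global A, which real-haze pairs only approximate.

```latex
\begin{aligned}
I(x) - J(x) &= \bigl(A - J(x)\bigr)\bigl(1 - t(x)\bigr) \\
I'(x) &= J(x) + \alpha\,\bigl(I(x) - J(x)\bigr)
       = J(x)\,t'(x) + A\,\bigl(1 - t'(x)\bigr) \\
t'(x) &= 1 - \alpha\,\bigl(1 - t(x)\bigr)
\end{aligned}
```

So α > 1 lowers the effective transmission (denser haze) and α < 1 raises it (thinner haze). With α = 1.2 the effective transmission stays non-negative only where t(x) ≥ 1/6; below that the blend overshoots and the final clip to [0, 1] absorbs the excess.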

Research Gap Radar — 5 Open Problems

Gap 1 — Synthetic-to-Real Domain Transfer

The community's best models achieve 40+ dB on RESIDE-SOTS-Indoor but only ~25 dB on I-Haze. This 15 dB gap represents a fundamental failure of synthetic-trained models on real data.

No paper has yet demonstrated a principled solution that closes this gap without fine-tuning on real image pairs. Domain adaptation, physics-informed augmentation, and self-supervised approaches are active but unresolved research directions.

Gap 2 — Non-Homogeneous Haze Modelling

The Atmospheric Scattering Model (ASM) assumes a single global atmospheric light A and a spatially uniform scattering coefficient β, assumptions that break down in many real-world environments.

NH-Haze highlighted this limitation by introducing spatially varying haze distributions. Current SOTA performance on NH-Haze remains substantially below performance on homogeneous-haze benchmarks.

Designing architectures that explicitly model spatially varying A(x) and β(x) remains an open research problem.
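As a starting point for experiments, spatially varying haze can be approximated synthetically by replacing the scalar β with a smooth random map β(x). The sketch below is a simplification under stated assumptions: β is applied per pixel instead of being integrated along the viewing ray, A is kept global, and the function names, grid size, and β range are all illustrative choices.

```python
import numpy as np

def smooth_random_field(shape, coarse=4, rng=None):
    """Bilinearly upsample a coarse random grid to `shape`; values stay in [0, 1]."""
    if rng is None:
        rng = np.random.default_rng()
    H, W = shape
    grid = rng.uniform(0.0, 1.0, size=(coarse, coarse))
    ys = np.linspace(0, coarse - 1, H)
    xs = np.linspace(0, coarse - 1, W)
    y0 = np.floor(ys).astype(int)
    x0 = np.floor(xs).astype(int)
    y1 = np.minimum(y0 + 1, coarse - 1)
    x1 = np.minimum(x0 + 1, coarse - 1)
    wy = (ys - y0)[:, None]
    wx = (xs - x0)[None, :]
    top = grid[y0][:, x0] * (1 - wx) + grid[y0][:, x1] * wx
    bottom = grid[y1][:, x0] * (1 - wx) + grid[y1][:, x1] * wx
    return top * (1 - wy) + bottom * wy

def synthesise_nonhomogeneous_haze(clean, depth, beta_min=0.02, beta_max=0.30,
                                   A=0.9, rng=None):
    """ASM with a per-pixel scattering coefficient beta(x)."""
    field = smooth_random_field(depth.shape, rng=rng)
    beta_map = beta_min + (beta_max - beta_min) * field
    t = np.exp(-beta_map * depth * 10.0)[..., None]   # depth scaled to 0-10 m
    hazy = clean * t + A * (1.0 - t)
    return np.clip(hazy, 0.0, 1.0), beta_map

rng = np.random.default_rng(1)
clean = rng.uniform(0.0, 1.0, size=(48, 64, 3))
depth = rng.uniform(0.0, 1.0, size=(48, 64))
hazy, beta_map = synthesise_nonhomogeneous_haze(clean, depth, rng=rng)
print(hazy.shape, round(float(beta_map.min()), 3), round(float(beta_map.max()), 3))
```

Data generated this way will not match the statistics of NH-Haze, but it gives a controllable test of whether an architecture copes with locally varying density.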

Gap 3 — Video Dehazing with Temporal Consistency

Most dehazing models operate on individual images.

When applied frame-by-frame to video, temporal artifacts such as flickering, colour instability, and inconsistent restoration often appear.

Adjacent video frames contain highly correlated haze information. Exploiting temporal consistency for stable video dehazing remains significantly underexplored compared to single-image dehazing.

Applications include:

Autonomous driving
UAV surveillance
Traffic monitoring
Video enhancement systems
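Flicker from frame-by-frame dehazing can be quantified crudely before investing in a full temporal model. The sketch below measures how much the mean luminance jumps between consecutive output frames. It is an assumed proxy and not a standard benchmark metric: it ignores genuine scene changes and camera motion, which a serious evaluation would compensate for (for example with optical-flow warping).

```python
import numpy as np

REC601 = np.array([0.299, 0.587, 0.114])  # RGB-to-luma weights

def flicker_score(frames):
    """Mean absolute change in average luminance between consecutive frames."""
    if len(frames) < 2:
        raise ValueError("need at least two frames")
    means = np.array([(frame @ REC601).mean() for frame in frames])
    return float(np.abs(np.diff(means)).mean())

# Toy sequences: a static scene, and the same scene with alternating gain
rng = np.random.default_rng(0)
base = rng.uniform(0.2, 0.8, size=(16, 16, 3))
steady = [base.copy() for _ in range(4)]
flickering = [base * gain for gain in (1.0, 0.8, 1.0, 0.8)]

steady_score = flicker_score(steady)
flicker = flicker_score(flickering)
print(f"steady: {steady_score:.4f}  flickering: {flicker:.4f}")
```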

Gap 4 — Dehazing for Downstream Tasks

Most image dehazing research optimizes PSNR and SSIM.

However, many real-world applications care more about downstream performance than reconstruction quality.

Examples include:

Object detection
License plate recognition
Semantic segmentation
Pedestrian detection
Autonomous navigation

The relationship between improved PSNR and improved task performance remains inconsistent. Task-aware dehazing loss functions and joint optimization frameworks remain active research areas.

Gap 5 — Nighttime and Coloured Haze

Nearly all standard dehazing datasets contain daytime scenes.

Nighttime haze introduces additional challenges:

Multiple light sources
Spatially varying illumination
Headlight scattering
Streetlight glow
Colour-dependent atmospheric effects

The standard ASM does not adequately model these phenomena.

Developing benchmark datasets, physical models, and deep learning architectures specifically for nighttime dehazing remains a major open problem.

Implementation Roadmap — 8-Step Week-by-Week Guide

This roadmap targets a Final Year / M.Tech student with 2–3 months before submission and a single GPU (8–24 GB VRAM).

Week 1 — Environment and Baseline

Install PyTorch 2.x, clone BasicIR or the FFA-Net repository, and reproduce DCP results on RESIDE-SOTS-Indoor.

Target baseline:

~16 dB (DCP)
If you can get AOD-Net working: ~20 dB

These numbers confirm your evaluation pipeline is functioning correctly.
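For reference, a minimal NumPy sketch of the Dark Channel Prior baseline (He et al.) is shown here. It uses the commonly cited defaults (15×15 patch, ω = 0.95, t0 = 0.1) but omits the transmission refinement step (soft matting or guided filtering), so its outputs will show block artefacts and its PSNR should not be expected to match published DCP numbers. Function names are illustrative, and `patch` is assumed to be odd.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def dark_channel(img, patch=15):
    """Per-pixel minimum over RGB followed by a patch x patch minimum filter."""
    min_rgb = img.min(axis=2)
    pad = patch // 2
    padded = np.pad(min_rgb, pad, mode="edge")
    windows = sliding_window_view(padded, (patch, patch))
    return windows.min(axis=(2, 3))

def estimate_atmospheric_light(img, dark, top_fraction=0.001):
    """Among the haziest pixels (largest dark channel), pick the brightest one."""
    n = max(1, int(dark.size * top_fraction))
    idx = np.argsort(dark.ravel())[-n:]
    candidates = img.reshape(-1, 3)[idx]
    return candidates[candidates.sum(axis=1).argmax()]

def dehaze_dcp(hazy, patch=15, omega=0.95, t0=0.1):
    """hazy: float array [H, W, 3] in [0, 1]. Returns (restored, t, A)."""
    dark = dark_channel(hazy, patch)
    A = np.maximum(estimate_atmospheric_light(hazy, dark), 1e-6)
    t = 1.0 - omega * dark_channel(hazy / A, patch)
    t = np.clip(t, t0, 1.0)                       # avoid dividing by ~0
    restored = (hazy - A) / t[..., None] + A      # invert the ASM
    return np.clip(restored, 0.0, 1.0), t, A

# Toy input: random scene under uniform haze (t = 0.5, A = 0.9)
rng = np.random.default_rng(0)
demo_clean = rng.uniform(0.0, 1.0, size=(32, 32, 3))
demo_hazy = demo_clean * 0.5 + 0.9 * 0.5
restored, t_est, A_est = dehaze_dcp(demo_hazy, patch=7)
print(restored.shape, float(t_est.min()), float(t_est.max()))
```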

Week 2 — Dataset Setup

Download:

RESIDE-ITS (training)
SOTS (testing)

Tasks:

Implement a data loader on top of the patch-extraction and augmentation code from the "How to Prepare Hazy-Clean Pairs for Training" section.
Compute average hazy-image statistics (mean brightness, contrast, estimated β distribution).
Visualise 10 hazy-clean image pairs.
Download O-Haze and I-Haze for later evaluation.
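The first practical hurdle in the data-loader task is matching each hazy file to its clean counterpart. The sketch below assumes that every hazy file name starts with the clean image's ID followed by an underscore (for example a hazy `1_1_0.90179.png` paired with a clean `1.png`). That convention is an assumption about how the downloaded archive is organised, and the file names used here are made up for illustration, so check both against your own copy.

```python
from pathlib import Path

def pair_hazy_with_clean(hazy_names, clean_names):
    """Match hazy files to clean files by the ID before the first underscore."""
    clean_by_id = {Path(name).stem: name for name in clean_names}
    pairs, unmatched = [], []
    for name in sorted(hazy_names):
        scene_id = Path(name).stem.split("_")[0]
        if scene_id in clean_by_id:
            pairs.append((name, clean_by_id[scene_id]))
        else:
            unmatched.append(name)
    return pairs, unmatched

# Hypothetical file names, for illustration only
hazy_names = ["1_1_0.90179.png", "1_2_0.97842.png",
              "2_1_0.99082.png", "7_3_0.81234.png"]
clean_names = ["1.png", "2.png"]

pairs, unmatched = pair_hazy_with_clean(hazy_names, clean_names)
for hazy_name, clean_name in pairs:
    print(hazy_name, "->", clean_name)
print("unmatched:", unmatched)
```

Logging the unmatched list instead of silently dropping files makes a partially extracted archive easy to spot.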

Week 3 — Train a Baseline CNN

Train:

FFA-Net, or
A simplified MSBDN model on RESIDE-ITS.

Target:

~30 dB on SOTS-Indoor after approximately 100 epochs.

If your model does not reach 30 dB, review:

Patch extraction
Data augmentation
Input normalization
Learning-rate scheduling
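On the last item, cosine annealing is one widely used schedule for restoration networks (PyTorch provides it as `torch.optim.lr_scheduler.CosineAnnealingLR`). The framework-free sketch below shows the curve it follows; the default peak rate of 1e-4 and the step counts are assumptions to tune, not recommended values.

```python
import math

def cosine_lr(step, total_steps, lr_max=1e-4, lr_min=0.0):
    """Learning rate at `step` under cosine annealing from lr_max to lr_min."""
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    step = min(max(step, 0), total_steps)
    cos_term = 0.5 * (1.0 + math.cos(math.pi * step / total_steps))
    return lr_min + (lr_max - lr_min) * cos_term

schedule = [cosine_lr(step, 100) for step in range(101)]
for step in (0, 25, 50, 75, 100):
    print(step, f"{schedule[step]:.2e}")
```

Plotting the logged learning rate next to the validation PSNR is a quick way to see whether a plateau comes from the schedule or from the data pipeline.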

Week 4 — Implement Your Contribution

Possible contributions:

A new attention mechanism (channel + spatial jointly)
A transmission-map estimation branch
A contrastive-learning component inspired by AECR-Net
A Swin Transformer replacement block

Implement the proposed component and validate it independently.

Week 5 — Integrate and Train Full Model

Train the complete architecture from scratch.

Tasks:

Save checkpoints periodically.
Evaluate on SOTS-Indoor and SOTS-Outdoor every 50 epochs.
Track both PSNR and SSIM.
Monitor training stability and convergence.

Week 6 — Cross-Dataset Evaluation

Run the trained model directly (zero-shot, no fine-tuning) on:

O-Haze
I-Haze
NH-Haze
Dense-Haze

Record all metrics and observations.

This is usually where model weaknesses become visible and where the research narrative begins to emerge.

Week 7 — Ablation Studies

Remove or disable one component at a time and retrain.

Examples:

Remove attention module.
Remove transmission branch.
Remove contrastive loss.
Replace custom block with baseline block.

For each ablation:

Measure ΔPSNR.
Measure ΔSSIM.
Compare computational cost.
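The bookkeeping for an ablation table can be kept in a small helper like the one sketched below. Every number in it is a PLACEHOLDER chosen to exercise the code, not a measured result; the dictionary keys and variant names are likewise illustrative.

```python
def ablation_deltas(full, variants):
    """Differences of each ablated variant relative to the full model."""
    rows = []
    for name, metrics in variants.items():
        rows.append({
            "variant": name,
            "d_psnr": round(metrics["psnr"] - full["psnr"], 2),
            "d_ssim": round(metrics["ssim"] - full["ssim"], 4),
            "params_ratio": round(metrics["params_m"] / full["params_m"], 2),
        })
    # Largest PSNR drop first: the most important component tops the table
    return sorted(rows, key=lambda row: row["d_psnr"])

# PLACEHOLDER values, not real experimental results
full_model = {"psnr": 30.0, "ssim": 0.950, "params_m": 4.0}
variants = {
    "no_contrastive": {"psnr": 29.7, "ssim": 0.945, "params_m": 4.0},
    "no_attention": {"psnr": 29.2, "ssim": 0.940, "params_m": 3.6},
}

rows = ablation_deltas(full_model, variants)
for row in rows:
    print(row)
```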

This section is often the most important part of the paper for reviewers.

Week 8 — Paper and Submission

Tools and Frameworks

BasicIR: The image restoration framework from XPixelGroup, the same team behind BasicSR. Specifically includes dehazing models (FFA-Net, AECR-Net) alongside other restoration tasks. Clean codebase, well-documented training configs.

Repository:

https://github.com/XPixelGroup/BasicIR

BasicSR: The broader image restoration framework covering denoising, super-resolution, and dehazing. Includes training scripts, loss functions, and metric evaluation utilities.

Repository:

https://github.com/XPixelGroup/BasicSR

DehazeFormer official code:

https://github.com/IDKiro/DehazeFormer

Includes training scripts for RESIDE and all five evaluation datasets. Best starting point for transformer-based dehazing.

OpenCV: For image I/O, colour space conversion, and the DCP baseline implementation.

pip install opencv-python

OpenCV does not ship a dedicated dehazing module, but CLAHE (cv2.createCLAHE), a contrast-limited adaptive histogram equalisation method, can serve as a simple non-learning baseline.

scikit-image: PSNR, SSIM, and image quality metrics.

pip install scikit-image

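IQA-PyTorch: Unified metric library covering full-reference and no-reference metrics such as PSNR, SSIM, LPIPS, and NIQE. The set of available metrics depends on the installed version (pyiqa.list_models() lists them), so confirm that FADE and CIEDE2000 are included before relying on them. Essential for comprehensive evaluation.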

Repository:

https://github.com/chaofengc/IQA-PyTorch

pip install pyiqa

FADE metric implementation:

https://github.com/Utkarsh-Deshmukh/Fog-Aware-Density-Evaluator

Python implementation of the FADE no-reference fog density metric.

7 Common Mistakes Researchers Make

Mistake 1 — Reporting Only RESIDE-SOTS Numbers

A paper that reports 40 dB on SOTS-Indoor but does not evaluate on any real-haze dataset will face justified reviewer pushback. The community has largely moved to requiring at least one real-haze benchmark alongside RESIDE.

Always include O-Haze or I-Haze results.

Mistake 2 — Confusing SOTS-Indoor and SOTS-Outdoor

SOTS-Indoor and SOTS-Outdoor are separate test sets with different scene content, depth ranges, and haze levels, so the same model can score several dB apart on them, and which subset scores higher varies by method.

Reporting "SOTS PSNR: 36 dB" without specifying indoor or outdoor is ambiguous and should be avoided.

Mistake 3 — Evaluating at Wrong Resolution

O-Haze, I-Haze, NH-Haze, and Dense-Haze are captured at high resolutions (approximately 2K–16 MP), but almost all studies evaluate at 512×512 or 1024×1024.

Some papers resize differently, leading to non-comparable numbers.

Always report the evaluation resolution.

Mistake 4 — Training and Testing on the Same Real-Haze Dataset

With only 45 pairs in O-Haze, some researchers mistakenly use all 45 pairs for both training and evaluation.

This inflates benchmark numbers.

The official training/validation/test split and evaluation protocol should always be respected.
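A cheap safeguard is to build the split from explicit scene IDs and fail loudly on any overlap. In the sketch below the ID ranges are placeholders used to exercise the check; substitute the lists published with the dataset's official protocol.

```python
def make_split(train_ids, val_ids, test_ids):
    """Return the split as sets, raising if any scene appears in two subsets."""
    split = {"train": set(train_ids), "val": set(val_ids), "test": set(test_ids)}
    names = list(split)
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            overlap = split[first] & split[second]
            if overlap:
                raise ValueError(
                    f"{first} and {second} share scenes: {sorted(overlap)}")
    return split

# Placeholder ID ranges for a 45-scene dataset; NOT necessarily the official split
split = make_split(range(1, 36), range(36, 41), range(41, 46))
print({name: len(ids) for name, ids in split.items()})
```

Splitting by scene ID instead of by patch also prevents patches from the same photograph landing in both training and test sets.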

Mistake 5 — Ignoring the Sky Region Problem

DCP and many CNN methods produce visible artefacts in sky regions (over-darkening in DCP, bright halos in CNNs).

If qualitative examples only show vegetation or textured regions, reviewers may notice the omission.

Include at least one challenging sky-containing image in qualitative evaluations.

Mistake 6 — Claiming SOTA Without Real-Haze Evaluation

Achieving 40+ dB on RESIDE-SOTS-Indoor is impressive but no longer sufficient to claim state-of-the-art performance.

Recent research places substantial emphasis on real-haze robustness.

Ensure your proposed method is evaluated on at least NH-Haze or Dense-Haze.

Mistake 7 — Using PSNR Alone to Compare Methods on Dense-Haze

On Dense-Haze, PSNR ranges are often narrow (approximately 10–17 dB), making small differences difficult to interpret.

Always supplement PSNR with:

SSIM
LPIPS
Visual side-by-side comparisons

A model achieving 15.5 dB with SSIM = 0.55 and visually strong texture restoration may be preferable to a model achieving 16.0 dB with SSIM = 0.45 and blurry outputs.

Your Next Steps + Conclusion

A Practical Action Plan

You now have a complete blueprint for image dehazing research. Here is how to convert it into output.

If you are a Final Year B.Tech student: Start with RESIDE-ITS and FFA-Net. Get a PSNR of approximately 30 dB on SOTS-Indoor. Then add O-Haze evaluation — that cross-dataset result is what makes a thesis examiner approve the work confidently.

If you are an M.Tech student with 6 months: Follow the 8-week roadmap. The key differentiator at this level is the cross-dataset evaluation table. Showing results on all five datasets — synthetic, real mild, real non-homogeneous, and real dense — demonstrates thoroughness that distinguishes publication-ready work from average academic projects.

If you are a PhD student: Engage directly with the Research Gap Radar. The synthetic-to-real gap (Gap 1) and non-homogeneous haze modelling (Gap 2) remain the two largest unresolved challenges in image dehazing research. Gap 4 (dehazing for downstream tasks) is particularly underexplored and is a natural fit for conferences such as CVPR, ICCV, ECCV, and AAAI, and for journals such as IEEE TIP.

What the Community Has Learned

The history of image dehazing benchmarks is essentially a progression toward increasingly realistic haze conditions.

RESIDE established the first large-scale benchmark and revealed that deep learning could outperform traditional prior-based methods on synthetic data.
O-Haze and I-Haze exposed the synthetic-to-real gap.
NH-Haze demonstrated that spatially varying haze significantly increases difficulty and challenges assumptions used by many traditional models.
Dense-Haze highlighted the fundamental limitations of existing approaches under severe visibility degradation.

Each benchmark contributed a specific piece to the overall research story. Researchers who understand the progression from RESIDE → O-Haze/I-Haze → NH-Haze → Dense-Haze are better equipped to position their work, anticipate reviewer concerns, and identify meaningful research contributions.

The datasets discussed in this guide remain highly relevant. Future improvements are unlikely to come solely from achieving marginal gains such as 0.2 dB PSNR improvements on SOTS. More impactful contributions will come from understanding why current methods fail under real haze and developing architectures that explicitly address those limitations.