Raghav Senthil Kumar

Posted on May 27 • Originally published at Medium

Week 1 of 13: EcoBin - A Web Application for Waste Classification

#ai #tensorflow #deeplearning #webdev

Introduction

One week ago, I posted that I'd be completing 13 projects over my 13 weeks before college. I'm thrilled to announce that I've finished Week 1 by building EcoBin: A Web Application for Waste Classification.

View on Vercel: https://eco-bin-sepia.vercel.app/

EcoBin is an AI-powered waste classification platform that helps people sort their waste correctly and develop better recycling habits. Waste is a massive problem. In the United States alone, over 292 million tons of waste are generated each year, and that number is expected to double by 2050. Yet less than a third of our waste stream is actually recycled, largely because roughly a quarter of the items in our recycling bin aren't actually recyclable.

EcoBin addresses the waste management problem by using computer vision to automatically identify and categorize waste items with up to ~95% accuracy. It also includes a flashcard quiz with over 100 questions to help people test and build their recycling knowledge. Specifically, the app offers two features to help users make better disposal decisions.

The first is a free AI tool in the Scan Waste Item tab. Upload a photo of the item you want to dispose of and EcoBin runs its two-stage neural network to identify the object, assess its condition, and tell you how to dispose of it.

The second is the Quiz Yourself tab. We have over 100 flashcards in our database covering a wide range of waste items. Each quiz pulls up to 10 random cards and asks you to guess how each item should be disposed of. After every question, you get an explanation of why your answer was right or wrong, and you get a full results summary at the end of the quiz. You can retake the quiz up to 10 times.

Why recycling needs AI in the first place

The Environmental Protection Agency estimates that as much as 75% of our waste stream could be recycled. However, our actual recycling yield rate sits at 32.1% - slightly below 1/3. This gap stems from two problems:

Wish-Cycling: The practice of placing items like plastic bags and styrofoam into the recycling bin in the hope that they will be recycled. Unfortunately, curbside programs cannot process these items. Instead, they contaminate legitimate recyclables, present hazards, damage machinery, and raise processing costs at Material Recovery Facilities.
Misclassification: Although machines can classify waste with high accuracy, a critical flaw remains. Existing waste classification models are typically trained to classify items by their material composition, which fails to account for contamination. Consider a pizza-stained cardboard box. A model may decide the box is recyclable because it is made of cardboard, but it ignores the grease and food residue that make it suited for the garbage bin instead.

I built EcoBin to solve this "contamination" problem. I isolated the problem into two parts, building two separate neural networks (Stage A and Stage B). I then combined them into one single waste classification pipeline at the end.

Stage A: Base Waste Classifier - Stage A classifies the item based on its material composition. It's a transfer learning model built on EfficientNetV2-S, pretrained on ImageNet and fine-tuned on the Recyclable and Household Waste Classification Dataset from Kaggle, which contains over 15,000 images across 30 classes covering plastic, paper, cardboard, glass, metal, organic waste, and textiles. Each class is mapped to one of four disposal pathways (garbage, curbside recycling, drop-off recycling, or compost) using the recycling guidelines for the City of Phoenix.
Stage B: Contamination Classifier - Stage B runs only when Stage A routes an item to either curbside recycling or drop-off recycling. It checks whether the item is clean enough to recycle or whether contamination (food residue, grease, mold, paint, etc.) should redirect it to garbage. Stage B is trained on a synthetic dataset of 9,000 images, generated by segmenting clean recyclables from the Stage A dataset using U2-Net and then compositing contamination textures from the Waste Contamination Textures Dataset onto each object surface.

Stage A: Base Waste Classifier

Waste Classification Dataset

Stage A is trained on the Recyclable and Household Waste Classification Dataset from Kaggle. I chose this dataset because it covers a wide range of waste categories: 15,000 images across 30 classes spanning plastic, paper, cardboard, glass, metal, organic waste, and textiles. Every class has exactly 500 images, so the dataset is balanced and there is no need to assign class weights.
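To see why a balanced dataset makes class weights unnecessary, here is a minimal sketch using the common "balanced" heuristic (total samples divided by the number of classes times the class count). The helper name and the three-class label lists are illustrative assumptions, not taken from the EcoBin notebook.

```python
from collections import Counter

def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}

# Illustrative labels: three classes with 500 images each (a hypothetical subset)
balanced = ["newspaper"] * 500 + ["shoes"] * 500 + ["eggshells"] * 500
weights_balanced = balanced_class_weights(balanced)

# A skewed version of the same labels, for contrast
skewed = ["newspaper"] * 500 + ["shoes"] * 250 + ["eggshells"] * 250
weights_skewed = balanced_class_weights(skewed)
```

With equal counts every weight comes out as 1.0, so weighting the loss would change nothing. In the skewed case the rarer classes get weights above 1.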

Each class is also split into two subcategories: default (clean, studio-like images of waste) and real_world (in-the-wild photos with varied lighting). This split lets us measure how well the model generalises beyond studio conditions when evaluating the Stage A results.
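As a rough sketch of how that comparison can be computed, the snippet below buckets predictions by subcategory. It assumes a folder layout of class/subcategory/file, and the paths and predictions are made up for illustration; the function names are hypothetical rather than taken from the notebook.

```python
from collections import defaultdict
from pathlib import PurePosixPath

def subset_of(image_path):
    """Return 'default' or 'real_world' from a path like <class>/<subset>/<file>."""
    return PurePosixPath(image_path).parent.name

def accuracy_by_subset(rows):
    """rows: iterable of (image_path, true_label, predicted_label)."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for image_path, true_label, predicted_label in rows:
        subset = subset_of(image_path)
        total[subset] += 1
        correct[subset] += int(true_label == predicted_label)
    return {subset: correct[subset] / total[subset] for subset in total}

# Hypothetical predictions, only to show the bookkeeping
rows = [
    ("newspaper/default/Image_1.png", "newspaper", "newspaper"),
    ("newspaper/default/Image_2.png", "newspaper", "newspaper"),
    ("newspaper/real_world/Image_1.png", "newspaper", "magazines"),
    ("newspaper/real_world/Image_2.png", "newspaper", "newspaper"),
]
per_subset = accuracy_by_subset(rows)
```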

Mapping Classes to Disposal Pathway

The United States has over 19,000 municipalities and every one of them has different recycling rules. To keep things simple, I used the recycling guidelines for the City of Phoenix (my hometown) to map each of the 30 classes to one of four disposal pathways. You can view the mapping of each class in the Python dictionary in the code cell below.

# Initializes a dictionary mapping each class of waste within the dataset to a disposal pathway
DISPOSAL_MAP = {

    # Curbside Recycling: 17 classes
    'plastic_soda_bottles':       'curbside_recycling',
    'aerosol_cans':               'curbside_recycling',
    'steel_food_cans':            'curbside_recycling',
    'cardboard_boxes':            'curbside_recycling',
    'glass_beverage_bottles':     'curbside_recycling',
    'plastic_cup_lids':           'curbside_recycling',
    'cardboard_packaging':        'curbside_recycling',
    'glass_food_jars':            'curbside_recycling',
    'aluminum_food_cans':         'curbside_recycling',
    'plastic_food_containers':    'curbside_recycling',
    'magazines':                  'curbside_recycling',
    'aluminum_soda_cans':         'curbside_recycling',
    'plastic_detergent_bottles':  'curbside_recycling',
    'newspaper':                  'curbside_recycling',
    'office_paper':               'curbside_recycling',
    'plastic_water_bottles':      'curbside_recycling',
    'glass_cosmetic_containers':  'curbside_recycling',

    # Drop-off Recycling: 4 classes
    'plastic_shopping_bags':      'dropoff_recycling',
    'plastic_trash_bags':         'dropoff_recycling',
    'clothing':                   'dropoff_recycling',
    'shoes':                      'dropoff_recycling',

    # Compost: 4 classes
    'eggshells':                  'compost',
    'coffee_grounds':             'compost',
    'tea_bags':                   'compost',
    'food_waste':                 'compost',

    # Garbage: 5 classes
    'disposable_plastic_cutlery': 'garbage',
    'styrofoam_cups':             'garbage',
    'styrofoam_food_containers':  'garbage',
    'plastic_straws':             'garbage',
    'paper_cups':                 'garbage',
}

Most plastics, papers, metals, and glass go to curbside recycling. Plastic bags and textiles (clothing and shoes) go to drop-off recycling. Eggshells, coffee grounds, tea bags, and food waste go to compost. Disposable cutlery, styrofoam, plastic straws, and paper cups go to garbage.

Stage A Training

Stage A is a transfer learning model built on top of EfficientNetV2-S, a convolutional neural network pretrained on ImageNet. I used transfer learning with ImageNet weights instead of training from scratch, so the model can pick up waste-specific patterns much faster and with far less data than it would otherwise need.

Training happens in two phases.

In Phase 1, the EfficientNetV2-S backbone stays frozen and only the new classification head learns. This is fast because almost all of the weights are locked, and it gets the head into a reasonable starting point without disturbing the pretrained features.

In Phase 2, I unfreeze the top half of the backbone and let it adapt slightly to waste imagery. The learning rate follows a warmup-cosine schedule that peaks at 1e-4. Every BatchNormalization layer stays frozen during Phase 2 to prevent the running statistics from drifting on the smaller batch sizes I was using. You can see the training strategy in code below.

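import tensorflow as tf
from tensorflow.keras.callbacks import EarlyStopping, ReduceLROnPlateau

# Assumes base_model, model_a, train_ds, val_ds, checkpoint_cb, freeze_batchnorm,
# WarmupCosineSchedule and MODELS_PATH are defined in earlier notebook cells.

#===========================
# Phase 1: Warm Up the Head
#===========================

# Backbone fully frozen, only the new classification head trains. Fast convergence.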
print("Phase 1 - warming up head (backbone frozen)")
base_model.trainable = False
model_a.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
    loss=tf.keras.losses.CategoricalCrossentropy(label_smoothing=0.1),  # Smoothing softens the one-hot targets
    metrics=["accuracy"],
)

PHASE1_EPOCHS = 15
history_p1 = model_a.fit(
    train_ds,
    epochs=PHASE1_EPOCHS,
    validation_data=val_ds,
    callbacks=[
        checkpoint_cb,
        EarlyStopping(monitor="val_accuracy", patience=5,
                      restore_best_weights=True, verbose=1),
        ReduceLROnPlateau(monitor="val_loss", factor=0.5,
                          patience=2, min_lr=1e-4, verbose=1),
    ],
    verbose=1,
)
phase1_epochs_run = len(history_p1.history["accuracy"])  # Remember how many epochs Phase 1 actually ran

#=================================================
# Phase 2: Fine-Tune the Top Half of the Backbone
#=================================================

# Unfreeze the top half of the backbone but keep BatchNorm in inference mode.
# Freezing BN stops its running statistics from drifting on the small fine-tune
# batches, which is what keeps the accuracy curve climbing through the
# transition rather than dipping.

print("\nPhase 2 - fine-tuning top half of backbone (BN frozen, cosine LR)")
base_model.trainable = True
total_layers   = len(base_model.layers)
unfreeze_from  = total_layers // 2
for layer in base_model.layers[:unfreeze_from]:
    layer.trainable = False                          # Bottom half stays frozen
freeze_batchnorm(base_model)                         # Every BN layer locked

trainable_count = sum(1 for layer in base_model.layers if layer.trainable)
print(f"  unfrozen backbone layers: {trainable_count} of {total_layers}")

PHASE2_EPOCHS   = 25
steps_per_epoch = tf.data.experimental.cardinality(train_ds).numpy()
total_steps     = int(steps_per_epoch) * PHASE2_EPOCHS
warmup_steps    = int(steps_per_epoch) * 3           # 3-epoch warmup before cosine decay kicks in

lr_schedule = WarmupCosineSchedule(
    start_lr=1e-6,
    peak_lr=1e-4,
    end_lr=1e-6,
    warmup_steps=warmup_steps,
    total_steps=total_steps,
)
model_a.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=lr_schedule),
    loss=tf.keras.losses.CategoricalCrossentropy(label_smoothing=0.1),
    metrics=["accuracy"],
)

history_p2 = model_a.fit(
    train_ds,
    epochs=PHASE2_EPOCHS,
    validation_data=val_ds,
    callbacks=[
        checkpoint_cb,
        EarlyStopping(monitor="val_accuracy", patience=8,
                      restore_best_weights=True, verbose=1),
    ],
    verbose=1,
)

#=====================================
# Merge Phase 1 and Phase 2 Histories
#=====================================

# Stitch the two history dicts together so the plotting code in section 2.6 of
# the notebook sees a single continuous curve. Only the keys present in both
# phases get merged (Phase 2 omits learning_rate logging because of the
# schedule object).

class _History:
    pass
history_a = _History()
shared_keys = history_p1.history.keys() & history_p2.history.keys()
history_a.history = {
    k: history_p1.history[k] + history_p2.history[k]
    for k in shared_keys
}
print(f"\nPhase 1 ran {phase1_epochs_run} epochs; "
      f"Phase 2 ran {len(history_p2.history['accuracy'])} epochs")
print(f"Best Stage A checkpoint saved to: {MODELS_PATH / 'stage_a_best.keras'}")
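The WarmupCosineSchedule class used in Phase 2 is not shown in this post. As a rough illustration of the learning-rate curve it describes, here is a plain-Python sketch that assumes a linear warmup followed by cosine decay. The function name and the steps_per_epoch value are hypothetical, and the real class may differ in its details.

```python
import math

def warmup_cosine_lr(step, start_lr, peak_lr, end_lr, warmup_steps, total_steps):
    """Learning rate at a given optimizer step (linear warmup, then cosine decay)."""
    if step < warmup_steps:
        # Linear ramp from start_lr up to peak_lr
        return start_lr + (peak_lr - start_lr) * step / max(1, warmup_steps)
    # Fraction of the decay phase completed, clamped to [0, 1]
    progress = min(1.0, (step - warmup_steps) / max(1, total_steps - warmup_steps))
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return end_lr + (peak_lr - end_lr) * cosine

# Learning rates and epoch counts from the Phase 2 code; steps_per_epoch is made up
steps_per_epoch = 100
schedule = dict(
    start_lr=1e-6,
    peak_lr=1e-4,
    end_lr=1e-6,
    warmup_steps=3 * steps_per_epoch,
    total_steps=25 * steps_per_epoch,
)

lr_first = warmup_cosine_lr(0, **schedule)
lr_peak = warmup_cosine_lr(3 * steps_per_epoch, **schedule)
lr_last = warmup_cosine_lr(25 * steps_per_epoch, **schedule)
```

Under these assumptions the rate starts at 1e-6, reaches the 1e-4 peak once the 3-epoch warmup ends, and decays back to 1e-6 by the final step.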

Stage A: Base Waste Classifier Results Evaluation

Below you can see the training metric graphs for Stage A.

Validation accuracy climbed steadily through Phase 1 and kept improving in the first few epochs of Phase 2, then plateaued around 86–87% while training accuracy kept climbing to 96%. That growing gap indicates overfitting in the later epochs of Phase 2.

The Stage A Base Waste Classifier achieved an overall accuracy of 87.3%. You can see the breakdown of accuracy, precision, recall and F1 for each class in the bar chart below.

Some classes, such as cardboard packaging, aluminum food cans and steel food cans, score poorly on the per-class metrics. This matters little in practice, because the app only needs an item's disposal pathway, not its exact class. Once the 30 classes are grouped into the 4 disposal pathways, accuracy jumps to 96.3%.

The adjusted bar chart shows that the low accuracy for classes like cardboard_packaging doesn't matter, because the model most often confuses it with cardboard_boxes and both go to curbside recycling anyway. The same holds for other underperforming classes such as steel_food_cans and aluminum_food_cans.
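The effect of grouping can be sketched in a few lines. The snippet copies five entries from DISPOSAL_MAP; the labels and predictions are invented purely to show how class-level mistakes can disappear at the pathway level, and the helper names are hypothetical.

```python
# A few entries copied from DISPOSAL_MAP (the full dictionary has 30 classes)
DISPOSAL_MAP = {
    "cardboard_boxes": "curbside_recycling",
    "cardboard_packaging": "curbside_recycling",
    "steel_food_cans": "curbside_recycling",
    "aluminum_food_cans": "curbside_recycling",
    "paper_cups": "garbage",
}

def accuracy(y_true, y_pred):
    """Fraction of positions where the two label lists agree."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def pathway_accuracy(y_true, y_pred, disposal_map):
    """Map both label lists to disposal pathways before comparing them."""
    true_paths = [disposal_map[c] for c in y_true]
    pred_paths = [disposal_map[c] for c in y_pred]
    return accuracy(true_paths, pred_paths)

# Hypothetical predictions: two same-pathway confusions, one real error, one hit
y_true = ["cardboard_packaging", "steel_food_cans", "paper_cups", "cardboard_boxes"]
y_pred = ["cardboard_boxes", "aluminum_food_cans", "cardboard_boxes", "cardboard_boxes"]

class_acc = accuracy(y_true, y_pred)
pathway_acc = pathway_accuracy(y_true, y_pred, DISPOSAL_MAP)
```

Here only one of four class predictions is right, but three of four pathways are, because the two can and cardboard mix-ups land in the same bin.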

Finally, an interesting feature of the dataset is that it lets us compare the model's accuracy on real_world vs. default test images. As expected, the model performs much better on default images, as you can see in the table below.

Stage B: Contamination Classifier

The Synthetic Data Problem

Stage B needs labeled examples of contaminated recyclables: a glass jar with food residue, a cardboard box with grease stains, a plastic container with mold. After searching the web extensively, I couldn't find a dataset that fit this purpose, so I had to generate my own synthetically.

The generation pipeline has three steps for each source image.

Step 1: Segment the object - I used U2-Net via the rembg library to produce a soft alpha mask for the object in each photo. A small morphological closing fills tiny interior holes (glass bottles are notorious for these), and a Gaussian blur feathers the edges of the mask so the contamination patch fades naturally at the object boundary instead of cutting off sharply. The code cells below display how object segmentation was accomplished.

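#===========================
# Object Segmentation Helper
#===========================

import io

import numpy as np
from PIL import Image
from rembg import remove
from scipy import ndimage

# REMBG_SESSION is assumed to be a rembg session created earlier in the notebook.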

def get_object_mask(img: np.ndarray) -> np.ndarray:
    """
    Runs U2-Net on the input image and returns a soft alpha mask for the
    object. The mask is cleaned up with a morphological closing (to fill small
    holes in transparent or reflective surfaces) and then feathered with a
    Gaussian blur so contamination patches fade naturally at the boundary
    instead of cutting off sharply.
    """

    buf = io.BytesIO()
    Image.fromarray(img).save(buf, format='PNG')
    output = remove(buf.getvalue(), session=REMBG_SESSION)
    alpha  = np.array(Image.open(io.BytesIO(output)).convert('RGBA'))[:, :, 3]
    alpha  = alpha.astype(np.float32) / 255.0
    # Fill small interior holes left by transparent or reflective surfaces
    binary = ndimage.binary_closing(
        alpha > 0.5, structure=np.ones((5, 5))
    ).astype(np.float32)
    # Feather edges so composited contamination fades at the object boundary
    return ndimage.gaussian_filter(binary, sigma=2)

Step 2: Paste a contamination texture - I scraped roughly 40 contamination texture images from various sources, grouped into 8 contaminant types, which you can view on the Waste Contamination Textures Dataset on Kaggle. For each source image, I randomly selected a texture and alpha-blended it with the underlying pixels using the object mask. The three-way weight (texture alpha times object mask times opacity scalar) ensures the texture is pasted only on the object in the image and not in the background. You can view a visual sample below.
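The apply_texture_patch helper is not shown in this post, so here is a per-pixel sketch of the three-way weight described above. It is an illustration under my own assumptions (the function name and colour values are hypothetical); an array-based version would apply the same formula to whole NumPy images at once.

```python
def blend_pixel(base_rgb, texture_rgb, texture_alpha, object_mask, opacity):
    """Alpha-blend one texture pixel onto one base pixel.

    texture_alpha, object_mask and opacity are floats in [0, 1].
    """
    # Three-way weight: zero wherever the mask says "background"
    weight = texture_alpha * object_mask * opacity
    return tuple(
        round((1.0 - weight) * b + weight * t)
        for b, t in zip(base_rgb, texture_rgb)
    )

base = (200, 200, 200)   # hypothetical light-grey object pixel
stain = (80, 40, 0)      # hypothetical brown stain pixel

# Inside the object (mask = 1) the stain shows through at the chosen opacity
on_object = blend_pixel(base, stain, texture_alpha=1.0, object_mask=1.0, opacity=0.5)

# On the background (mask = 0) the base pixel is left untouched
on_background = blend_pixel(base, stain, texture_alpha=1.0, object_mask=0.0, opacity=0.5)
```

Because the mask is feathered, pixels near the object boundary get a fractional weight, which is what makes the stain fade out instead of ending in a hard edge.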

Step 3: Save with metadata - Each generated image gets logged in a manifest CSV with the source class, contaminant subgroup, severity level (light: 1 contamination stain, medium: 2 contamination stains, heavy: 3 contamination stains), and a source_stem that links contaminated copies back to their clean original. The code cell below shows how the synthetic data generation loop works.

#========================
# Source Image Selection
#========================

records           = []
contaminant_types = sorted(texture_pool.keys())
assert len(contaminant_types) == 8, \
    f"Expected 8 contaminant types, found {len(contaminant_types)}: {contaminant_types}"

# Collect every source image grouped by class so we can stratified-sample
images_by_class = {}
for cls in RECYCLABLE_CLASSES:
    cls_dir = DATASET_PATH / cls
    if not cls_dir.exists():
        print(f"WARNING: {cls} not found, skipping")
        continue
    images_by_class[cls] = sorted(p for p in cls_dir.rglob("*") if p.is_file())

# Allocate the 1000-image quota proportional to each class's share of the pool
total_available = sum(len(v) for v in images_by_class.values())
target_total    = min(MAX_IMAGES, total_available)
rng             = np.random.default_rng(SEED)

selected = []
for cls, files in images_by_class.items():
    share = round(target_total * len(files) / total_available)
    share = min(share, len(files))
    if share == 0:
        continue
    idx = rng.choice(len(files), size=share, replace=False)
    selected.extend((cls, files[i]) for i in idx)

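# Trim or top up to hit target_total exactly (rounding can drift by a couple)
if len(selected) > target_total:
    selected = selected[:target_total]
elif len(selected) < target_total:
    needed = target_total - len(selected)
    already_selected = set(selected)                 # Build the lookup set once, not once per candidate
    leftover = [(cls, f) for cls, files in images_by_class.items() for f in files
                if (cls, f) not in already_selected]
    # Sample by index so the extra entries stay (class, path) tuples like the rest of `selected`
    extra_idx = rng.choice(len(leftover), size=min(needed, len(leftover)), replace=False)
    selected.extend(leftover[i] for i in extra_idx)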

random.Random(SEED).shuffle(selected)
n_sources = len(selected)
print(f"Sources : {n_sources:,}")
print(f"Expected outputs: {n_sources * 9:,} ({n_sources:,} clean + {n_sources * 8:,} contaminated)")
print(f"Textures: {contaminant_types}")
print()

#========================
# Generation Loop
#========================

for class_name, img_path in tqdm(selected, desc="Generating"):
    source_stem = f"{img_path.parent.name}_{img_path.stem}"

    base_img = np.array(
        Image.open(img_path).convert("RGB").resize((IMG_SIZE, IMG_SIZE), Image.LANCZOS)
    )

    # Save the clean copy first so even if the contamination step fails we still have the unmodified source
    clean_dir  = STAGE_B_PATH / "clean" / class_name
    clean_dir.mkdir(parents=True, exist_ok=True)
    clean_path = clean_dir / f"{source_stem}.png"
    Image.fromarray(base_img).save(clean_path, format="PNG")

    records.append({
        "image_path":   str(clean_path),
        "label":        "clean",
        "subgroup":     "none",
        "level":        "none",
        "source_class": class_name,
        "source_stem":  source_stem,
    })

    # rembg is the expensive call; run it once per source then reuse the mask for every contaminated variant
    mask = get_object_mask(base_img)

    # One contaminated copy for every texture type with severity picked randomly
    for contaminant_type in contaminant_types:
        textures = texture_pool[contaminant_type]
        level    = random.choice(list(CONTAMINATION_LEVELS.keys()))
        n_patches = CONTAMINATION_LEVELS[level]

        result_img = base_img.copy()
        for _ in range(n_patches):
            tex_path = random.choice(textures)
            texture  = load_texture(tex_path)
            alpha    = random.uniform(0.45, 0.75)
            result_img = apply_texture_patch(result_img, texture, mask, alpha=alpha)

        cont_dir  = STAGE_B_PATH / "contaminated" / contaminant_type / level
        cont_dir.mkdir(parents=True, exist_ok=True)
        cont_path = cont_dir / f"{source_stem}_{contaminant_type}_{level}.png"
        Image.fromarray(result_img).save(cont_path, format="PNG")

        records.append({
            "image_path":   str(cont_path),
            "label":        "contaminated",
            "subgroup":     contaminant_type,
            "level":        level,
            "source_class": class_name,
            "source_stem":  source_stem,
        })

print(f"\nGenerated {len(records):,} images")

Total output: 9,000 images (1,000 clean source images plus 8 contaminated copies of each, one per contaminant type).
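The step that writes the manifest CSV is not shown above. A minimal sketch of it, using the same record fields as the generation loop, might look like the following. The file paths, the "grease" subgroup name, and the helper names are hypothetical, and the sketch writes to an in-memory buffer rather than a real file.

```python
import csv
import io

FIELDS = ["image_path", "label", "subgroup", "level", "source_class", "source_stem"]

def write_manifest(records, handle):
    """Write one CSV row per generated image, with a header row first."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(records)

def group_by_source(records):
    """Collect each clean image and its contaminated copies under one source_stem."""
    groups = {}
    for record in records:
        groups.setdefault(record["source_stem"], []).append(record)
    return groups

# Two hypothetical records for a single source image
records = [
    {"image_path": "stage_b/clean/glass_food_jars/default_Image_1.png",
     "label": "clean", "subgroup": "none", "level": "none",
     "source_class": "glass_food_jars", "source_stem": "default_Image_1"},
    {"image_path": "stage_b/contaminated/grease/light/default_Image_1_grease_light.png",
     "label": "contaminated", "subgroup": "grease", "level": "light",
     "source_class": "glass_food_jars", "source_stem": "default_Image_1"},
]

buffer = io.StringIO()
write_manifest(records, buffer)
manifest_text = buffer.getvalue()
groups = group_by_source(records)
```

Grouping by source_stem is one way to keep a clean image and its contaminated copies in the same train or validation split. That use is my assumption, not something the notebook excerpt confirms.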

Stage B Training and Results Evaluation

Stage B sits on top of the same EfficientNetV2-S backbone as Stage A and inherits all of its visual features. The only thing that changes is the classification head: it's swapped out for a new 9-class softmax that distinguishes clean recyclables from the 8 contaminant subgroups.

Validation accuracy consistently sits above training accuracy for Stage B, which suggests the model isn't overfitting. Phase 2 was also much more effective here than in Stage A: both train and val accuracy spike clearly after the unfreeze.

Stage B reached an overall accuracy of ~85%. You can view the breakdown of accuracy, precision, recall and F1 for each class in the bar chart above. However, because Stage B was trained on a synthetic dataset, I believe its accuracy is inflated and expect it to drop on real-world contaminated recyclables.

Deployment to Vercel

The Vercel frontend is a Next.js 15 app with three tabs:

About: context on why I built EcoBin.
Scan Waste Item: photo upload with drag-and-drop and inference via the HuggingFace Space API that determines which disposal pathway (garbage, curbside recycling, drop-off recycling, compost) your waste item belongs to.
Quiz Yourself: a 10-question flashcard quiz drawn from a database of 100+ waste items, with per-question feedback and a results summary at the end.

The inference backend is a FastAPI server running in a Docker container on HuggingFace Spaces. It exposes a single POST /infer endpoint that accepts a base64-encoded image, applies a face detection privacy gate (so faces in the frame get rejected before anything else runs), always runs Stage A on every input image, conditionally runs Stage B if Stage A determines the item is recyclable, and returns a structured disposal recommendation.
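The control flow of that endpoint can be sketched without FastAPI. In the snippet below the face detector and both stages are passed in as plain callables, and the response field names, the "grease" label, and the stub models are assumptions for illustration rather than the actual API contract.

```python
RECYCLING_PATHWAYS = {"curbside_recycling", "dropoff_recycling"}

def recommend_disposal(image, contains_face, stage_a, stage_b):
    """Route one image through the privacy gate and the two-stage pipeline."""
    # Privacy gate: reject before any model runs
    if contains_face(image):
        return {"status": "rejected", "reason": "face_detected"}

    # Stage A always runs and proposes a pathway from material composition
    pathway = stage_a(image)
    result = {"status": "ok", "stage_a_pathway": pathway,
              "stage_b_label": None, "pathway": pathway}

    # Stage B runs only for recyclable pathways and may redirect to garbage
    if pathway in RECYCLING_PATHWAYS:
        label = stage_b(image)
        result["stage_b_label"] = label
        if label != "clean":
            result["pathway"] = "garbage"
    return result

def never_called(image):
    raise AssertionError("Stage B should not run for non-recyclable pathways")

# Stub models standing in for the real networks
greasy_box = recommend_disposal(
    "pizza_box.jpg",
    contains_face=lambda img: False,
    stage_a=lambda img: "curbside_recycling",
    stage_b=lambda img: "grease",
)
eggshells = recommend_disposal(
    "eggshells.jpg",
    contains_face=lambda img: False,
    stage_a=lambda img: "compost",
    stage_b=never_called,
)
selfie = recommend_disposal("selfie.jpg", lambda img: True, never_called, never_called)
```

The greasy pizza box is what Stage A alone would send to curbside recycling. With Stage B in the loop, it ends up in garbage.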

Challenges and What's Next for EcoBin

The biggest problems with EcoBin are the two domain gaps:

Stage A's domain gap: Stage A performs noticeably better on the default test images than on the real_world ones (scroll up to the Stage A evaluation for the exact difference). This means EcoBin struggles to predict disposal pathways for the real-world photos that you, the user, may upload. Moreover, EcoBin was trained on a relatively small dataset of only 30 classes, and there are many more categories of waste out there. If you upload an item that isn't among those classes, the prediction may simply be wrong. To fix this, I'm going to add a feedback button to the app so users can flag wrong predictions, and use those corrections as fine-tuning data.
Stage B's domain gap: The synthetic contamination data doesn't generalize well, because it looks too different from contamination in the real world. To fix this, I will create a small dataset by photographing real contaminated recyclables in my household. I also plan to rebuild Stage B as an object detection model so it can localize traces of contamination on an object.

I'm also planning to add support for more municipalities beyond Phoenix, so the app can adapt to local recycling rules. The biggest challenge remains the contamination dataset, however: photographing enough real-world examples to train the model will take a lot of effort.

Links

Vercel App: https://eco-bin-sepia.vercel.app/
GitHub: https://github.com/Raghavsk24/EcoBin
Kaggle: https://www.kaggle.com/code/ragbag84/ecobin-two-stage-waste-classification-pipeline

That's week 1!

Thanks for reading,
Raghav
