A surprisingly effective lightweight sentiment analysis approach for product reviews in Python

#ai #python

I’ve been working on sentiment analysis for a large set of e-commerce product reviews recently. The original plan was to build a more advanced NLP pipeline, but for early-stage classification I ended up testing a much simpler lexicon-based approach first.

Honestly, I was surprised by how well it performed for basic positive/negative review detection.

Here’s a stripped-down version:

import re

positive_words = {
    'good', 'great', 'excellent',
    'amazing', 'love', 'perfect',
    'fantastic', 'wonderful',
}

negative_words = {
    'bad', 'terrible', 'awful',
    'hate', 'poor', 'horrible',
    'worst', 'disappointing',
}

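def simple_sentiment(text):
    """Label text as 'positive', 'negative', or 'neutral' by counting lexicon hits."""
    # Lowercase and split into word tokens; punctuation is dropped.
    words = re.findall(r'\w+', text.lower())

    pos_count = sum(1 for w in words if w in positive_words)
    neg_count = sum(1 for w in words if w in negative_words)

    if pos_count > neg_count:
        return 'positive'
    elif neg_count > pos_count:
        return 'negative'
    # Ties, including reviews with no lexicon words at all, count as neutral.
    return 'neutral'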

Example usage:

reviews = [
    "This product is amazing and works perfectly!",
    "Terrible quality, I hate it.",
    "It's okay, does the job.",
]

for review in reviews:
    print(f"'{review}' -> {simple_sentiment(review)}")
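If you want a number instead of a gut feeling about "how well it performed", a quick check against a few hand-labeled reviews is enough to get started. This is only a rough sketch: the `labeled` sample and the `evaluate` helper are made up for illustration and are not from my actual dataset. The lexicon is repeated so the snippet runs on its own.

```python
import re
from collections import Counter

POSITIVE = {'good', 'great', 'excellent', 'amazing',
            'love', 'perfect', 'fantastic', 'wonderful'}
NEGATIVE = {'bad', 'terrible', 'awful', 'hate',
            'poor', 'horrible', 'worst', 'disappointing'}

def simple_sentiment(text):
    words = re.findall(r'\w+', text.lower())
    pos = sum(1 for w in words if w in POSITIVE)
    neg = sum(1 for w in words if w in NEGATIVE)
    if pos > neg:
        return 'positive'
    if neg > pos:
        return 'negative'
    return 'neutral'

# Hypothetical hand-labeled sample, invented for this example only.
labeled = [
    ("Great value, I love it", 'positive'),
    ("Worst purchase ever, awful build", 'negative'),
    ("Arrived on Tuesday", 'neutral'),
    ("Not good at all", 'negative'),  # negation: the lexicon only sees 'good'
]

def evaluate(samples):
    """Return (accuracy, Counter of (expected, predicted) pairs for the misses)."""
    errors = Counter()
    correct = 0
    for text, expected in samples:
        predicted = simple_sentiment(text)
        if predicted == expected:
            correct += 1
        else:
            errors[(expected, predicted)] += 1
    return correct / len(samples), errors

accuracy, errors = evaluate(labeled)
print(f"accuracy: {accuracy:.2f}")
for (expected, predicted), count in errors.items():
    print(f"expected {expected}, got {predicted}: {count}")
```

Counting the misses by (expected, predicted) pair makes it easier to see which kind of mistake dominates, which is usually more useful than the accuracy figure alone.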

A few things I learned while testing:

simple heuristics can go surprisingly far
preprocessing quality matters more than expected
sarcasm and mixed sentiment break naive models quickly
domain-specific vocabulary heavily affects accuracy
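
To make a couple of those points concrete, here is a minimal sketch of how the counter could be extended. Everything new in it is my own assumption for illustration: the `NEGATORS` set, the crude suffix stripping in `normalize`, and the `extra_positive` / `extra_negative` parameters for domain vocabulary. It flips the polarity of a word that directly follows a negator, which handles "not good" but does nothing for sarcasm.

```python
import re

POSITIVE = {'good', 'great', 'excellent', 'amazing',
            'love', 'perfect', 'fantastic', 'wonderful'}
NEGATIVE = {'bad', 'terrible', 'awful', 'hate',
            'poor', 'horrible', 'worst', 'disappointing'}
NEGATORS = {'not', 'no', 'never'}   # assumption: a deliberately tiny list
SUFFIXES = ('ly', 'ed', 'd', 's')   # assumption: crude, not a real stemmer

def normalize(word, lexicon):
    """Strip one common suffix if that turns the word into a lexicon entry."""
    if word in lexicon:
        return word
    for suffix in SUFFIXES:
        if word.endswith(suffix) and word[:-len(suffix)] in lexicon:
            return word[:-len(suffix)]
    return word

def sentiment_with_negation(text, extra_positive=(), extra_negative=()):
    positive = POSITIVE | set(extra_positive)
    negative = NEGATIVE | set(extra_negative)
    lexicon = positive | negative

    # Split "isn't" / "don't" style contractions so "not" becomes its own token.
    text = re.sub(r"n['’]t\b", " not", text.lower())
    words = [normalize(w, lexicon) for w in re.findall(r'\w+', text)]

    score = 0
    for i, word in enumerate(words):
        polarity = (word in positive) - (word in negative)  # +1, -1, or 0
        if i > 0 and words[i - 1] in NEGATORS:
            polarity = -polarity  # "not good" counts as negative
        score += polarity

    if score > 0:
        return 'positive'
    if score < 0:
        return 'negative'
    return 'neutral'

print(sentiment_with_negation("Not good at all"))
print(sentiment_with_negation("Works perfectly"))
print(sentiment_with_negation("Feels sturdy and solid",
                              extra_positive={'sturdy', 'solid'}))
```

The suffix stripping is what lets "perfectly" match "perfect", and the extra word sets are where product-specific terms would go. Both are blunt tools, so expect false matches on a real dataset.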

For larger datasets, I eventually moved toward pre-trained sentiment APIs and transformer-based models for better nuance detection, but this lightweight approach was incredibly useful for prototyping and bulk filtering.
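
For the bulk-filtering part, one way to combine the two approaches is to let the lexicon label only the reviews where it has a clear margin and pass everything else to the heavier model. The sketch below is an illustration under my own assumptions: `lexicon_score`, `triage`, and the `min_margin` threshold are hypothetical names and values, and the heavier model itself is left out.

```python
import re

POSITIVE = {'good', 'great', 'excellent', 'amazing',
            'love', 'perfect', 'fantastic', 'wonderful'}
NEGATIVE = {'bad', 'terrible', 'awful', 'hate',
            'poor', 'horrible', 'worst', 'disappointing'}

def lexicon_score(text):
    """Positive hits minus negative hits."""
    words = re.findall(r'\w+', text.lower())
    pos = sum(1 for w in words if w in POSITIVE)
    neg = sum(1 for w in words if w in NEGATIVE)
    return pos - neg

def triage(reviews, min_margin=2):
    """Split reviews into confidently labeled ones and ones needing a second look.

    Returns (confident, uncertain), where confident is a list of
    (review, label) pairs and uncertain is a list of raw reviews.
    min_margin is an arbitrary threshold chosen for this example.
    """
    confident, uncertain = [], []
    for review in reviews:
        score = lexicon_score(review)
        if abs(score) >= min_margin:
            label = 'positive' if score > 0 else 'negative'
            confident.append((review, label))
        else:
            uncertain.append(review)
    return confident, uncertain

reviews = [
    "Great phone, love the excellent camera",
    "Terrible and awful",
    "Good screen but poor battery",
    "It's okay",
]
confident, uncertain = triage(reviews)
print("labeled by lexicon:", confident)
print("send to heavier model:", uncertain)
```

Mixed reviews like "Good screen but poor battery" score zero and end up in the uncertain pile, which is where a transformer-based model or a sentiment API earns its cost.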

Curious what other people here are using for sentiment analysis in Python these days:

TextBlob?
VADER?
spaCy?
Hugging Face transformers?
custom fine-tuned models?

Would love to hear what’s worked best for real-world review datasets.