“Can you get high accuracy in sentiment analysis without touching deep learning?”
That's the question that sparked my curiosity — and led to a project that amazed me with its results.
In this blog, I'll walk you through my journey of building a sentiment analysis system using only classical machine learning (no neural networks, no transformers) and still reaching an impressive accuracy of 89.12%.
📌 Problem Statement
The goal: Sentiment Analysis — classifying text as either positive or negative.
Rather than relying on RNNs, LSTMs, or BERT, I challenged myself to stay within the boundaries of classical machine learning algorithms.
🧪 Dataset
I used the IMDb movie reviews dataset:
- 40,000 training reviews
- 10,000 testing reviews
- Binary labels: positive / negative
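If you want to follow along, the loading and split look roughly like this. I'm assuming the common single-CSV version of the dataset here, so the file name and column names below are placeholders rather than the exact ones from my repo:

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Assumed layout: one CSV with a 'review' text column and a 'sentiment'
# column holding 'positive' / 'negative' labels.
df = pd.read_csv("IMDB Dataset.csv")
df["label"] = (df["sentiment"] == "positive").astype(int)  # positive -> 1, negative -> 0

# 40,000 reviews for training, 10,000 held out for testing
X_train, X_test, y_train, y_test = train_test_split(
    df["review"], df["label"], test_size=10_000, stratify=df["label"], random_state=42
)
```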
🧹 Preprocessing Steps
- Lowercasing text
- Removing HTML tags
- Cleaning punctuation
- Tokenization
- Stopword removal (using NLTK)
- Stemming with Porter Stemmer
- Vectorization using TF-IDF
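Putting those steps into code, the cleaning and vectorization pipeline looks roughly like this, continuing from the split above. The helper name and the TF-IDF settings are illustrative choices, not necessarily the exact values from my repo:

```python
import re
import string

import nltk
from nltk.corpus import stopwords
from nltk.stem import PorterStemmer
from nltk.tokenize import word_tokenize
from sklearn.feature_extraction.text import TfidfVectorizer

nltk.download("stopwords")
nltk.download("punkt")

STOPWORDS = set(stopwords.words("english"))
stemmer = PorterStemmer()

def clean_text(text: str) -> str:
    text = text.lower()                                    # lowercasing
    text = re.sub(r"<.*?>", " ", text)                     # remove HTML tags like <br />
    text = text.translate(str.maketrans("", "", string.punctuation))  # clean punctuation
    tokens = word_tokenize(text)                           # tokenization
    tokens = [stemmer.stem(t) for t in tokens if t not in STOPWORDS]  # stopwords + stemming
    return " ".join(tokens)

X_train_clean = X_train.apply(clean_text)
X_test_clean = X_test.apply(clean_text)

# TF-IDF vectorization on the cleaned text
tfidf = TfidfVectorizer(max_features=50_000, ngram_range=(1, 2))
X_train_vec = tfidf.fit_transform(X_train_clean)
X_test_vec = tfidf.transform(X_test_clean)
```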
⚙️ Baseline Model: GaussianNB
To begin, I tested the simplest model possible: Gaussian Naive Bayes.
📈 Accuracy: 82.00%
This gave me a quick baseline — but I knew I could push further.
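For context, here's roughly what that baseline looks like, continuing from the vectorizer above. One practical note: scikit-learn's GaussianNB doesn't accept sparse matrices, so the TF-IDF features have to be densified, which usually means keeping the vocabulary modest for this particular model:

```python
from sklearn.metrics import accuracy_score
from sklearn.naive_bayes import GaussianNB

# GaussianNB needs dense input; .toarray() is memory-heavy, so a smaller
# TF-IDF vocabulary (e.g. a few thousand features) keeps this step tractable.
clf1 = GaussianNB()
clf1.fit(X_train_vec.toarray(), y_train)

baseline_acc = accuracy_score(y_test, clf1.predict(X_test_vec.toarray()))
print(f"GaussianNB baseline accuracy: {baseline_acc:.4f}")
```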
🔍 Model Experiments
I tested multiple models:
```python
from sklearn.naive_bayes import MultinomialNB, BernoulliNB
from sklearn.linear_model import (
    LogisticRegression, RidgeClassifier, SGDClassifier, PassiveAggressiveClassifier
)
from sklearn.svm import LinearSVC

clf2 = MultinomialNB(alpha=1.0)                                 # NB over TF-IDF counts
clf3 = BernoulliNB(alpha=1.0)                                   # NB over binary word presence
clf4 = LogisticRegression(solver='saga', max_iter=1000)         # linear, probabilistic
clf5 = LinearSVC(max_iter=5000)                                 # linear SVM, hinge loss
clf6 = SGDClassifier(loss='log_loss', max_iter=1000, tol=1e-3)  # logistic regression via SGD
clf7 = RidgeClassifier(alpha=1.0, solver='auto')                # L2-regularized least squares
clf8 = PassiveAggressiveClassifier(max_iter=1000, tol=1e-3, early_stopping=False)  # online margin-based learner
```
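Each candidate was then fit on the TF-IDF features and scored on the held-out set, roughly like this:

```python
from sklearn.metrics import accuracy_score

models = {
    "MultinomialNB": clf2,
    "BernoulliNB": clf3,
    "LogisticRegression": clf4,
    "LinearSVC": clf5,
    "SGDClassifier": clf6,
    "RidgeClassifier": clf7,
    "PassiveAggressive": clf8,
}

# Train each model and compare test accuracy to pick candidates for tuning
for name, model in models.items():
    model.fit(X_train_vec, y_train)
    acc = accuracy_score(y_test, model.predict(X_test_vec))
    print(f"{name}: {acc:.4f}")
```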
🧪 GridSearchCV for Tuning
I applied GridSearchCV on the top 4 models:
```python
# Logistic Regression
param_grid = {'C': [0.01, 0.1, 1, 5, 10], 'solver': ['saga'], 'max_iter': [1000]}
# Best: {'C': 0.1, 'solver': 'saga'} => 88.7%

# LinearSVC
param_grid = {'C': [0.01, 0.1, 1, 5, 10], 'max_iter': [5000]}
# Best: {'C': 0.01} => 88.645%

# MultinomialNB
param_grid = {'alpha': [0.01, 0.1, 0.5, 1.0, 5.0]}
# Best: {'alpha': 1.0} => 85.37%

# SGDClassifier
param_grid = {
    'alpha': [0.0001, 0.001, 0.01],
    'loss': ['hinge', 'log_loss'],
    'penalty': ['l2', 'l1', 'elasticnet'],
    'max_iter': [1000],
}
# Best: {'alpha': 0.001, 'loss': 'log_loss', 'penalty': 'l2'} => 88.75%
```
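For anyone who wants the full pattern, here's how one of these searches looks end to end, using Logistic Regression as the example. The fold count and scoring choice below are assumptions on my part; the grid and best result are the ones listed above:

```python
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

param_grid = {'C': [0.01, 0.1, 1, 5, 10], 'solver': ['saga'], 'max_iter': [1000]}

grid = GridSearchCV(
    LogisticRegression(),
    param_grid,
    cv=5,                # assumed 5-fold cross-validation
    scoring="accuracy",
    n_jobs=-1,
)
grid.fit(X_train_vec, y_train)

print(grid.best_params_)               # best: C=0.1 with the saga solver
print(grid.score(X_test_vec, y_test))  # ~88.7% on the test set
```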
🧪 Final Accuracy after Retraining
After retraining each model with its tuned hyperparameters, the strongest individual models were SGDClassifier at 88.75% and Logistic Regression at 88.7%, with LinearSVC close behind at 88.645%.
🧠 Stacking Ensemble
To boost performance further, I implemented stacking:
- Base Models: SGD, LogisticRegression, LinearSVC
- Meta Model: LogisticRegression
📈 Stacking Accuracy: 89.12%
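Here's a sketch of that single-layer stack using scikit-learn's StackingClassifier. The base-model hyperparameters are the tuned values from the grid search; the cv setting for generating the meta-features is an assumption:

```python
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression, SGDClassifier
from sklearn.svm import LinearSVC

stack = StackingClassifier(
    estimators=[
        ("sgd", SGDClassifier(loss="log_loss", alpha=0.001, penalty="l2", max_iter=1000)),
        ("logreg", LogisticRegression(C=0.1, solver="saga", max_iter=1000)),
        ("svc", LinearSVC(C=0.01, max_iter=5000)),
    ],
    final_estimator=LogisticRegression(),  # meta-model trained on base-model outputs
    cv=5,                                  # out-of-fold predictions for the meta-model (assumed)
    n_jobs=-1,
)

stack.fit(X_train_vec, y_train)
print(stack.score(X_test_vec, y_test))
```

One detail worth noting: since LinearSVC has no predict_proba, StackingClassifier falls back to its decision_function outputs when building the meta-features.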
🧪 Deep Stacking Experiment
I also tried a multi-layer stack:
- Layer 1: Logistic, LinearSVC, MultinomialNB, SGD
- Layer 2: LogisticRegression
- Layer 3: RidgeClassifier, SGD → Final VotingClassifier
But… accuracy slightly dropped:
📉 Accuracy: 89.09%
🔁 I reverted to the single-layer stack, which performed best.
🔧 Tools & Libraries Used
- Python
- Scikit-learn
- Pandas
- NLTK
- Matplotlib / Seaborn
🎯 Key Learnings
- Traditional ML can still compete with deep learning in text tasks
- Logistic Regression + TF-IDF = surprisingly powerful
- Ensemble methods like stacking can push the limits
📝 Final Words
This project wasn't about beating deep learning. It was about challenging assumptions — and proving that, with the right setup, classic ML still holds its ground.
If you're new to ML or want to understand the fundamentals before diving into deep models, this path is for you.
Feel free to connect, share thoughts, or collaborate. I’d love to hear your feedback!
Repo link: https://github.com/anikchand461/sentiment-analysis