Chronic Kidney Disease (CKD) is a global health concern affecting millions of people worldwide. Early detection is crucial, as timely intervention can significantly reduce the need for dialysis and kidney transplants.
NephroPredict is a machine learning-based project designed to provide an efficient solution for early CKD detection using clinical and laboratory data.
Project Overview
- Goal: Early detection of CKD using predictive machine learning models
- Dataset: Chronic Kidney Disease dataset (UCI Repository)
- Approach: Implemented five ML models: Logistic Regression, KNN, SVM, Decision Tree, and Random Forest
- Tuning: Applied hyperparameter tuning (GridSearchCV) to improve model performance
- Evaluation Metrics: Accuracy, Precision, Recall, F1-score, and Cross-validation scores
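To make the evaluation metrics above concrete, here is a minimal sketch of how accuracy, precision, recall, and F1-score follow from the prediction counts. The function name `classification_metrics` and the label lists are hypothetical, chosen for illustration only; they are not taken from the project or the dataset.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall, and F1 for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    if not pairs:
        raise ValueError("need at least one prediction")

    # Count the four confusion-matrix cells.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Illustrative labels only (1 = CKD, 0 = not CKD); not real patient data.
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
y_pred = [1, 1, 0, 0, 1, 1, 0, 1]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

For a screening task like CKD detection, recall is usually the number to watch most closely, since a false negative means a missed case.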
Attributes Used
The dataset consists of important medical attributes, including:
Blood Pressure (Bp), Specific Gravity (Sg), Albumin (Al), Sugar (Su), Red Blood Cells (Rbc), Blood Urea (Bu), Serum Creatinine (Sc), Sodium (Sod), Potassium (Pot), Hemoglobin (Hemo), White Blood Cell Count (Wbcc), Red Blood Cell Count (Rbcc), Hypertension (Htn), Class (Target Variable)
Models & Results
The following models were implemented and compared before and after hyperparameter tuning:
- Logistic Regression
- K-Nearest Neighbors (KNN)
- Support Vector Machine (SVM)
- Decision Tree
- Random Forest
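A before-and-after comparison of this kind could be sketched as follows with scikit-learn. This is not the project's notebook code: the data is a synthetic stand-in generated with `make_classification` (13 features, matching the number of non-target attributes listed above), and every parameter grid is an assumption chosen only to keep the example small.

```python
from sklearn.base import clone
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the CKD table (assumption): 13 features, binary target.
X, y = make_classification(n_samples=400, n_features=13, n_informative=6,
                           random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)

# Each entry: (estimator with default settings, illustrative parameter grid).
# Scale-sensitive models are wrapped in a pipeline with StandardScaler.
candidates = {
    "Logistic Regression": (
        make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
        {"logisticregression__C": [0.1, 1, 10]}),
    "KNN": (
        make_pipeline(StandardScaler(), KNeighborsClassifier()),
        {"kneighborsclassifier__n_neighbors": [3, 5, 7]}),
    "SVM": (
        make_pipeline(StandardScaler(), SVC()),
        {"svc__kernel": ["linear", "rbf"], "svc__C": [0.1, 1, 10]}),
    "Decision Tree": (
        DecisionTreeClassifier(random_state=42),
        {"max_depth": [3, 5, None], "criterion": ["gini", "entropy"]}),
    "Random Forest": (
        RandomForestClassifier(random_state=42),
        {"n_estimators": [50, 100], "max_depth": [5, None]}),
}

results = {}
for name, (estimator, grid) in candidates.items():
    # "Before": the untuned estimator fitted on the training split.
    baseline = clone(estimator).fit(X_train, y_train)
    # "After": 5-fold grid search on the training split, refit on the best grid point.
    search = GridSearchCV(estimator, grid, cv=5, scoring="accuracy")
    search.fit(X_train, y_train)
    results[name] = {
        "before": baseline.score(X_test, y_test),
        "after": search.score(X_test, y_test),
        "best_params": search.best_params_,
    }

for name, row in results.items():
    print(f"{name:20s} before={row['before']:.3f} after={row['after']:.3f}")
```

Tuning only ever sees the training split here, so the held-out test accuracy stays an honest estimate. A tuned model can still score lower on the test set than its untuned baseline, because the grid search optimizes cross-validated accuracy rather than test accuracy.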
Key Findings
- Logistic Regression and KNN showed slight improvement after tuning, but also signs of overfitting
- SVM improved significantly after tuning, highlighting the importance of kernel choice and regularization strength
- Decision Tree accuracy dropped after tuning, showing sensitivity to depth and splitting criteria
- Random Forest consistently achieved the best results, with 100% training accuracy and 96.55% test accuracy; the roughly 3.5-point gap suggests mild overfitting, but it remained the strongest of the five models
Conclusion: Random Forest is the most reliable model for this dataset and is recommended for CKD prediction.
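A perfect training score next to a lower test score is worth checking against cross-validation before relying on a single split. The sketch below shows one way to do that for a Random Forest. It runs on synthetic data from `make_classification` (an assumption, not the CKD dataset), so its numbers will not match the ones reported above.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score, train_test_split

# Synthetic stand-in data (assumption): 13 features, binary target.
X, y = make_classification(n_samples=400, n_features=13, n_informative=6,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(X_train, y_train)

train_acc = forest.score(X_train, y_train)
test_acc = forest.score(X_test, y_test)

# 5-fold cross-validation on the training split, using a fresh estimator
# so that each fold is fitted from scratch.
cv_scores = cross_val_score(
    RandomForestClassifier(n_estimators=100, random_state=0),
    X_train, y_train, cv=5)

print(f"train={train_acc:.3f} test={test_acc:.3f} gap={train_acc - test_acc:.3f}")
print(f"5-fold CV mean={cv_scores.mean():.3f} std={cv_scores.std():.3f}")
```

If the cross-validation mean sits close to the test accuracy, the held-out score is probably a fair estimate even though the training score is inflated.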
📈 Visualization
The project includes styled tables and bar charts for comparing model performance.
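A grouped bar chart of this kind might be drawn with matplotlib as in the sketch below. The helper `plot_accuracy_comparison` and the output file name are hypothetical, and the only scores plotted are the Random Forest figures reported in this post; the other models' values would come from your own run.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen so the sketch runs without a display
import matplotlib.pyplot as plt


def plot_accuracy_comparison(scores):
    """Draw grouped bars from {model_name: (train_accuracy, test_accuracy)}."""
    names = list(scores)
    positions = list(range(len(names)))
    width = 0.38

    fig, ax = plt.subplots(figsize=(6, 4))
    # One bar pair per model: training accuracy on the left, test on the right.
    ax.bar([p - width / 2 for p in positions],
           [scores[n][0] for n in names], width, label="Train")
    ax.bar([p + width / 2 for p in positions],
           [scores[n][1] for n in names], width, label="Test")

    ax.set_xticks(positions)
    ax.set_xticklabels(names)
    ax.set_ylabel("Accuracy (%)")
    ax.set_ylim(0, 105)
    ax.set_title("Model comparison")
    ax.legend()
    fig.tight_layout()
    return fig, ax


# Only the Random Forest figures come from this post.
fig, ax = plot_accuracy_comparison({"Random Forest": (100.0, 96.55)})
fig.savefig("model_comparison.png")
```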
Repository
The complete source code, dataset, and notebook are available on GitHub:
🔗 Machine Learning for Chronic Kidney Disease Detection
👨‍💻 Author
Developed by Abubakar Shabbir
📜 License
This project is licensed under the MIT License – feel free to use, modify, and share.
✨ If you found this project helpful, don’t forget to leave a ⭐ on GitHub!