AbuBakar Shabbir

🩺 NephroPredict: Machine Learning for Chronic Kidney Disease Detection

Chronic Kidney Disease (CKD) is a global health concern affecting millions of people worldwide. Early detection is crucial, as timely intervention can significantly reduce the need for dialysis and kidney transplants.

NephroPredict is a machine learning project for early CKD detection from clinical and laboratory data.


Project Overview

  • Goal: Early detection of CKD using predictive machine learning models
  • Dataset: Chronic Kidney Disease dataset (UCI Repository)
  • Approach: Trained and compared five classifiers: Logistic Regression, KNN, SVM, Decision Tree, and Random Forest
  • Tuning: Applied hyperparameter tuning (GridSearchCV) to improve model performance
  • Evaluation Metrics: Accuracy, Precision, Recall, F1-score, and Cross-validation scores
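
To make the evaluation metrics listed above concrete, here is a minimal sketch that computes accuracy, precision, recall, and F1-score from true and predicted labels. The function name `binary_metrics`, the `Metrics` container, and the labels are made up for illustration; the project itself may well rely on a library such as scikit-learn for this.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    accuracy: float
    precision: float
    recall: float
    f1: float


def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    if not pairs:
        raise ValueError("need at least one label pair")

    # Confusion-matrix counts relative to the positive class.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return Metrics(accuracy, precision, recall, f1)


# Made-up labels for illustration only (1 = CKD, 0 = not CKD).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
m = binary_metrics(y_true, y_pred)
print(m)
```

In a screening setting, recall on the CKD class is usually the number to watch, since a missed case costs more than a false alarm.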


Attributes Used

The dataset consists of important medical attributes, including:

Blood Pressure (Bp), Specific Gravity (Sg), Albumin (Al), Sugar (Su), Red Blood Cells (Rbc), Blood Urea (Bu), Serum Creatinine (Sc), Sodium (Sod), Potassium (Pot), Hemoglobin (Hemo), White Blood Cell Count (Wbcc), Red Blood Cell Count (Rbcc), Hypertension (Htn), Class (Target Variable)
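
As a rough sketch, the attributes above could be separated into features and target like this. It assumes the data sits in a pandas DataFrame whose column names match the abbreviations; the helper `split_features_target`, the two rows, and the 1/0 encoding of `Class` are all made up for illustration and are not taken from the dataset.

```python
import pandas as pd

# Column names follow the abbreviations listed above.
FEATURES = ["Bp", "Sg", "Al", "Su", "Rbc", "Bu", "Sc",
            "Sod", "Pot", "Hemo", "Wbcc", "Rbcc", "Htn"]
TARGET = "Class"


def split_features_target(df):
    """Return (X, y) after checking that every expected column is present."""
    missing = [c for c in FEATURES + [TARGET] if c not in df.columns]
    if missing:
        raise KeyError(f"missing columns: {missing}")
    return df[FEATURES], df[TARGET]


# Two made-up rows, for illustration only.
rows = [
    [90, 1.010, 3, 1, 0, 60, 2.5, 135, 5.0, 10.0, 9000, 3.8, 1, 1],
    [70, 1.025, 0, 0, 1, 25, 0.9, 140, 4.2, 15.0, 6700, 5.0, 0, 0],
]
df = pd.DataFrame(rows, columns=FEATURES + [TARGET])
X, y = split_features_target(df)
print(X.shape, y.tolist())
```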


Models & Results

The following models were implemented and compared before and after hyperparameter tuning:

  • Logistic Regression
  • K-Nearest Neighbors (KNN)
  • Support Vector Machine (SVM)
  • Decision Tree
  • Random Forest
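
The before/after comparison can be sketched with scikit-learn roughly as follows. This is not the project's notebook: the data is a synthetic stand-in generated with `make_classification`, and the parameter grids, split ratio, and random seeds are assumptions chosen only to keep the example small.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the CKD data (13 features, binary target).
X, y = make_classification(
    n_samples=300, n_features=13, n_informative=6, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Each entry: (estimator, hypothetical parameter grid).
candidates = {
    "Logistic Regression": (LogisticRegression(max_iter=1000),
                            {"C": [0.1, 1, 10]}),
    "KNN": (KNeighborsClassifier(), {"n_neighbors": [3, 5, 7]}),
    "SVM": (SVC(), {"C": [0.1, 1, 10], "kernel": ["linear", "rbf"]}),
    "Decision Tree": (DecisionTreeClassifier(random_state=0),
                      {"max_depth": [3, 5, None],
                       "criterion": ["gini", "entropy"]}),
    "Random Forest": (RandomForestClassifier(random_state=0),
                      {"n_estimators": [50, 100], "max_depth": [None, 5]}),
}

results = {}
for name, (model, grid) in candidates.items():
    # "Before": default hyperparameters, scored on the held-out test split.
    before = model.fit(X_train, y_train).score(X_test, y_test)
    # "After": 5-fold grid search on the training split, then the same test.
    search = GridSearchCV(model, grid, cv=5, scoring="accuracy")
    search.fit(X_train, y_train)
    results[name] = {
        "before": before,
        "after": search.score(X_test, y_test),
        "best_params": search.best_params_,
    }

for name, r in results.items():
    print(f"{name}: {r['before']:.3f} -> {r['after']:.3f} {r['best_params']}")
```

Because the grid search selects on cross-validated training folds, test accuracy after tuning can go down as well as up.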

Key Findings

  • Logistic Regression and KNN showed slight improvement after tuning, but also signs of overfitting
  • SVM significantly improved after tuning, highlighting the importance of kernel choice and regularization strength
  • Decision Tree accuracy dropped after tuning, showing sensitivity to depth and splitting criteria
  • Random Forest consistently achieved the best results with 100% training accuracy and 96.55% test accuracy, making it the most robust model

Conclusion: Random Forest is the most reliable model for this dataset and is recommended for CKD prediction.
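
One way to check a result like this is to look at training accuracy, test accuracy, and cross-validation scores side by side. The sketch below does that for a Random Forest on synthetic stand-in data; the dataset, split, and settings are assumptions, so the printed numbers will not match the figures reported above.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score, train_test_split

# Synthetic stand-in for the CKD data (13 features, binary target).
X, y = make_classification(
    n_samples=300, n_features=13, n_informative=6, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(X_train, y_train)

train_acc = forest.score(X_train, y_train)
test_acc = forest.score(X_test, y_test)

# 5-fold cross-validation on the training split; each fold fits a fresh clone.
cv_scores = cross_val_score(
    RandomForestClassifier(n_estimators=100, random_state=0),
    X_train, y_train, cv=5,
)

print(f"train={train_acc:.4f} test={test_acc:.4f} gap={train_acc - test_acc:.4f}")
print(f"cv mean={cv_scores.mean():.4f} std={cv_scores.std():.4f}")
```

A Random Forest often fits its training data almost perfectly, so the test and cross-validation scores are the more informative numbers here.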


📈 Visualization

The project includes styled tables and bar charts for comparing model performance.
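
A comparison bar chart of this kind could be drawn with matplotlib along these lines. The function `plot_model_comparison` is a hypothetical helper, not code from the repository. Only the Random Forest figures come from this post; the second entry is a placeholder to show the layout and is not a reported result.

```python
import matplotlib

matplotlib.use("Agg")  # render without a display
import matplotlib.pyplot as plt


def plot_model_comparison(scores):
    """Draw grouped bars of train and test accuracy per model."""
    names = list(scores)
    train = [scores[n]["train"] for n in names]
    test = [scores[n]["test"] for n in names]
    positions = list(range(len(names)))
    width = 0.4

    fig, ax = plt.subplots(figsize=(8, 4))
    ax.bar([p - width / 2 for p in positions], train, width, label="Train")
    ax.bar([p + width / 2 for p in positions], test, width, label="Test")
    ax.set_xticks(positions)
    ax.set_xticklabels(names)
    ax.set_ylabel("Accuracy (%)")
    ax.set_ylim(0, 105)
    ax.legend()
    fig.tight_layout()
    return fig


scores = {
    # Reported in this post.
    "Random Forest": {"train": 100.0, "test": 96.55},
    # Placeholder values, NOT a reported result.
    "Placeholder model": {"train": 90.0, "test": 85.0},
}
fig = plot_model_comparison(scores)
```

Calling `fig.savefig("model_comparison.png")` afterwards writes the chart to disk.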


Repository

The complete source code, dataset, and notebook are available on GitHub:

🔗 Machine Learning for Chronic Kidney Disease Detection


👨‍💻 Author

Developed by Abubakar Shabbir


📜 License

This project is licensed under the MIT License – feel free to use, modify, and share.


If you found this project helpful, don’t forget to leave a ⭐ on GitHub!
