<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Cecilia Ngunjiri</title>
    <description>The latest articles on DEV Community by Cecilia Ngunjiri (@cecilia_ngunjiri).</description>
    <link>https://dev.to/cecilia_ngunjiri</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2821177%2Fbee8497c-cd64-4a5c-b6fe-7fd455004fbc.png</url>
      <title>DEV Community: Cecilia Ngunjiri</title>
      <link>https://dev.to/cecilia_ngunjiri</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/cecilia_ngunjiri"/>
    <language>en</language>
    <item>
      <title>Classification Problem: Predicting Values in Group Column</title>
      <dc:creator>Cecilia Ngunjiri</dc:creator>
      <pubDate>Mon, 24 Feb 2025 10:46:25 +0000</pubDate>
      <link>https://dev.to/cecilia_ngunjiri/classification-problem-predicting-values-in-group-column-1gee</link>
      <guid>https://dev.to/cecilia_ngunjiri/classification-problem-predicting-values-in-group-column-1gee</guid>
      <description>&lt;p&gt;Project link: &lt;a href="https://github.com/CessNgunjiri/Project/blob/main/cecilia.ipynb" rel="noopener noreferrer"&gt;https://github.com/CessNgunjiri/Project/blob/main/cecilia.ipynb&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Introduction&lt;/strong&gt;&lt;br&gt;
The project involves a classification problem. The objective is to predict the values in the "group" column, which can either be "control" or "patient."&lt;/p&gt;

&lt;p&gt;Dataset Columns and Descriptions&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;rownames: A unique identifier for each record in the dataset, often used as an index.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;subject: Represents the identifier or label for the individuals or entities being studied.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;age: Indicates the age of the subjects in the dataset.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;group: The target variable for the classification problem, categorizing each subject as either "control" or "patient."&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Data Understanding/Inspection&lt;/p&gt;

&lt;p&gt;I started by importing the Python libraries required for this project: pandas, seaborn, and matplotlib.pyplot.&lt;br&gt;
The data has 5 columns and 945 rows.&lt;/p&gt;

&lt;p&gt;Data Cleaning&lt;/p&gt;

&lt;p&gt;The dataset was clean: there were no missing values in any column and no duplicate rows.&lt;/p&gt;
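&lt;p&gt;A minimal sketch of how such checks can be run with pandas. The small DataFrame is made-up stand-in data, not the project dataset; only the column names are taken from the list above.&lt;/p&gt;

```python
import pandas as pd

# Hypothetical stand-in rows; the real project loads 945 rows from its own file.
df = pd.DataFrame(
    {
        "subject": ["s1", "s2", "s3", "s4"],
        "age": [34, 51, 29, 46],
        "group": ["control", "patient", "patient", "control"],
    }
)

# Count missing values per column and fully duplicated rows.
missing_per_column = df.isnull().sum()
duplicate_rows = int(df.duplicated().sum())

print(df.shape)            # (rows, columns)
print(missing_per_column)
print(duplicate_rows)
```

&lt;p&gt;If both counts are zero, no imputation or de-duplication step is needed before moving on.&lt;/p&gt;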

&lt;p&gt;&lt;em&gt;Checking for Outliers&lt;/em&gt;&lt;br&gt;
I used a box plot to check for outliers in the dataset. There are notable outliers in the exercise column.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fiaqmh41qzw3byhzqiezv.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fiaqmh41qzw3byhzqiezv.png" alt="Image description" width="467" height="841"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;To count the outliers, I calculated the interquartile range (IQR) and flagged the values that fall outside the bounds derived from it.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9xnhe44pfvgqb8uwci6m.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9xnhe44pfvgqb8uwci6m.png" alt="Image description" width="800" height="251"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I decided to remove the outliers because they account for only 83 of the 945 rows (about 8.8%). Removing them cleans up the distribution without discarding a large part of the data.&lt;/p&gt;
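&lt;p&gt;The IQR rule can be sketched as follows. The values are made up to stand in for the exercise column, and the 1.5 multiplier is the conventional choice, assumed here because the post does not state it.&lt;/p&gt;

```python
import pandas as pd

# Made-up values standing in for the "exercise" column; 40 is an obvious outlier.
df = pd.DataFrame({"exercise": [1, 2, 2, 3, 3, 3, 4, 4, 5, 40]})

q1 = df["exercise"].quantile(0.25)
q3 = df["exercise"].quantile(0.75)
iqr = q3 - q1

# Conventional 1.5 * IQR fences (an assumption, see the note above).
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# A value is an outlier when it lies outside the two fences.
is_outlier = ~df["exercise"].between(lower_fence, upper_fence)
n_outliers = int(is_outlier.sum())

# Keep only the rows inside the fences.
df_clean = df.loc[~is_outlier].reset_index(drop=True)
print(n_outliers, len(df_clean))
```

&lt;p&gt;Comparing the outlier count with the total number of rows gives the share of data that would be lost by dropping them.&lt;/p&gt;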

&lt;p&gt;To ensure data integrity, I decided to remove non-numeric characters in the subject column. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flmwk393l201i7c6xa4s8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flmwk393l201i7c6xa4s8.png" alt="Image description" width="800" height="80"&gt;&lt;/a&gt;&lt;/p&gt;
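&lt;p&gt;One way to strip non-numeric characters is a regular-expression replace. This is a sketch only: the subject labels below are hypothetical, since the post does not show the real label format.&lt;/p&gt;

```python
import pandas as pd

# Hypothetical subject labels; the real format is not shown in the post.
df = pd.DataFrame({"subject": ["S001", "sub-17", "42"]})

# Drop every character that is not a digit, then convert to integers.
df["subject"] = (
    df["subject"].astype(str).str.replace(r"\D", "", regex=True).astype(int)
)
print(df["subject"].tolist())
```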

&lt;p&gt;&lt;em&gt;Exploratory Data Analysis&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;I performed both univariate and bivariate analysis.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjphowpr5xv97ndv49uji.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjphowpr5xv97ndv49uji.png" alt="Image description" width="752" height="823"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5iwa3bsh5rlnrioc0uqc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5iwa3bsh5rlnrioc0uqc.png" alt="Image description" width="737" height="310"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F38r93ol3i45lej8po9jc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F38r93ol3i45lej8po9jc.png" alt="Image description" width="797" height="756"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fz4dtrsglac9cpjefh16n.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fz4dtrsglac9cpjefh16n.png" alt="Image description" width="709" height="822"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmx62tjlpzgraje93onaw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmx62tjlpzgraje93onaw.png" alt="Image description" width="524" height="272"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Data Preprocessing&lt;/p&gt;

&lt;p&gt;&lt;em&gt;One Hot Encoding&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;I one-hot encoded the "control" and "patient" entries in the group column.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffoyltcbkn6xjkt8yfzx7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffoyltcbkn6xjkt8yfzx7.png" alt="Image description" width="593" height="364"&gt;&lt;/a&gt;&lt;/p&gt;
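&lt;p&gt;A minimal sketch of encoding a two-class column with pandas. The sample rows are made up, and using drop_first=True to keep a single 0/1 column is an assumption; it yields control as 0 and patient as 1, matching the class labels in the evaluation reports.&lt;/p&gt;

```python
import pandas as pd

# Made-up rows; only the two category names come from the post.
df = pd.DataFrame({"group": ["control", "patient", "patient", "control"]})

# One-hot encode, dropping the first category so a single 0/1 column remains.
encoded = pd.get_dummies(df["group"], prefix="group", drop_first=True).astype(int)
df["group"] = encoded["group_patient"]
print(df["group"].tolist())
```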

&lt;p&gt;&lt;em&gt;Standard Scaling&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm6cbok1z04h5uh09uqc9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm6cbok1z04h5uh09uqc9.png" alt="Image description" width="600" height="298"&gt;&lt;/a&gt;&lt;/p&gt;
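&lt;p&gt;Standard scaling rescales each feature to zero mean and unit variance. A short sketch with scikit-learn's StandardScaler, using a made-up feature matrix:&lt;/p&gt;

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Made-up feature matrix: the two columns stand in for age and exercise.
X = np.array([[25.0, 1.0], [35.0, 3.0], [45.0, 5.0]])

scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)  # each column now has mean 0 and unit variance
print(X_scaled)
```

&lt;p&gt;A common variation is to fit the scaler on the training split only and reuse it on the test split, so that test-set statistics do not influence training.&lt;/p&gt;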

&lt;p&gt;&lt;em&gt;Handling Class Imbalance using SMOTE&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Faun0p8qjsenuu8yw4pt4.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Faun0p8qjsenuu8yw4pt4.png" alt="Image description" width="605" height="515"&gt;&lt;/a&gt;&lt;/p&gt;
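&lt;p&gt;SMOTE balances classes by creating synthetic minority samples that lie between existing minority samples and their nearest neighbours. The NumPy sketch below is a simplified illustration of that idea, not the code used in the project; the function name smote_like_oversample and the sample points are hypothetical. Libraries such as imbalanced-learn provide a full implementation.&lt;/p&gt;

```python
import numpy as np


def smote_like_oversample(X_min, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between minority neighbours."""
    rng = np.random.default_rng(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.integers(len(X_min))
        # Distances from the chosen point to every minority point.
        dists = np.linalg.norm(X_min - X_min[i], axis=1)
        # Indices of the k nearest neighbours, skipping the point itself.
        neighbours = np.argsort(dists)[1 : k + 1]
        j = rng.choice(neighbours)
        gap = rng.random()  # position along the segment between the two points
        synthetic.append(X_min[i] + gap * (X_min[j] - X_min[i]))
    return np.array(synthetic)


# Made-up minority-class points (3 samples, 2 features).
X_minority = np.array([[1.0, 1.0], [2.0, 1.5], [1.5, 2.0]])
X_new = smote_like_oversample(X_minority, n_new=4)
X_balanced_minority = np.vstack([X_minority, X_new])
print(X_balanced_minority.shape)
```

&lt;p&gt;Because every synthetic point sits on a segment between two real minority points, the new samples stay inside the region the minority class already occupies.&lt;/p&gt;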

&lt;p&gt;&lt;em&gt;Data Splitting&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;I split the data into training and testing sets: 80% of the data for training and 20% for testing.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbiv6jakstgof1al731vb.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbiv6jakstgof1al731vb.png" alt="Image description" width="575" height="204"&gt;&lt;/a&gt;&lt;/p&gt;
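&lt;p&gt;An 80/20 split can be sketched with scikit-learn's train_test_split. The data is made up, and the random_state and stratify arguments are assumptions added for reproducibility and balanced class proportions.&lt;/p&gt;

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Made-up data: 10 samples, 2 features, binary labels.
X = np.arange(20).reshape(10, 2)
y = np.array([0, 0, 0, 0, 1, 1, 1, 1, 1, 1])

# test_size=0.2 gives the 80/20 split; random_state and stratify are assumptions.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)
print(len(X_train), len(X_test))
```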

&lt;p&gt;&lt;strong&gt;Modeling&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;Baseline Model: Logistic Regression Model&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;A baseline model provides a reference point for evaluating more complex models. It helps set realistic expectations, identify potential issues, and confirm that advanced techniques offer meaningful improvements.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkgjyj6kkmmm6whzviqe9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkgjyj6kkmmm6whzviqe9.png" alt="Image description" width="601" height="643"&gt;&lt;/a&gt;&lt;/p&gt;
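&lt;p&gt;A minimal baseline sketch with scikit-learn's LogisticRegression. It runs on synthetic data from make_classification, so the score it prints says nothing about the project dataset; max_iter=1000 is an assumption to avoid convergence warnings.&lt;/p&gt;

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data; the project uses its own cleaned, scaled features.
X, y = make_classification(n_samples=200, n_features=4, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

baseline = LogisticRegression(max_iter=1000)
baseline.fit(X_train, y_train)

# Accuracy on the held-out 20% is the number later models need to beat.
baseline_accuracy = accuracy_score(y_test, baseline.predict(X_test))
print(round(baseline_accuracy, 2))
```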

&lt;p&gt;&lt;em&gt;Other Models:&lt;br&gt;
Random Forest Classifier&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fk77vna6c70r9qtbsu658.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fk77vna6c70r9qtbsu658.png" alt="Image description" width="732" height="157"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Random Forest Classifier Evaluation:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;              precision    recall  f1-score   support

           0       1.00      1.00      1.00        66
           1       1.00      1.00      1.00       123

    accuracy                           1.00       189
   macro avg       1.00      1.00      1.00       189
weighted avg       1.00      1.00      1.00       189
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Precision:&lt;br&gt;
Precision measures how many of the predicted positives were actually correct. Class 0 (control): 100% of the predictions for control were correct. Class 1 (patient): 100% of the predictions for patient were correct.&lt;/p&gt;

&lt;p&gt;Recall:&lt;br&gt;
Recall tells us how many of the actual positives were correctly identified. Class 0 (control): 100% of actual control cases were identified. Class 1 (patient): 100% of actual patient cases were identified.&lt;/p&gt;

&lt;p&gt;F1-Score:&lt;br&gt;
F1-Score is the harmonic mean of precision and recall, providing a single score. Class 0 (control): 100%. Class 1 (patient): 100%.&lt;/p&gt;

&lt;p&gt;Support:&lt;br&gt;
Support is the number of actual cases of each class in the test set. Class 0 (control): 66 instances. Class 1 (patient): 123 instances.&lt;/p&gt;

&lt;p&gt;Accuracy: Accuracy tells us the overall proportion of correct predictions. Accuracy: 100%. The model got all 189 test predictions correct.&lt;/p&gt;
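&lt;p&gt;To make the definitions concrete, here is a small worked example in plain Python. The labels are made up (0 = control, 1 = patient) and the helper name class_metrics is hypothetical; it assumes the class in question is both present and predicted at least once.&lt;/p&gt;

```python
# Made-up true and predicted labels: 0 = control, 1 = patient.
y_true = [0, 0, 0, 1, 1, 1, 1, 1]
y_pred = [0, 0, 1, 1, 1, 1, 1, 1]


def class_metrics(y_true, y_pred, positive):
    """Return (precision, recall, f1, support) for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp)   # share of predictions for this class that are right
    recall = tp / (tp + fn)      # share of actual cases of this class that are found
    f1 = 2 * precision * recall / (precision + recall)
    support = tp + fn            # number of actual cases of this class
    return precision, recall, f1, support


accuracy = sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)

print(class_metrics(y_true, y_pred, positive=0))
print(class_metrics(y_true, y_pred, positive=1))
print(accuracy)
```

&lt;p&gt;One control case is misclassified as patient here, which lowers recall for class 0 and precision for class 1 while leaving the other two values at 100%.&lt;/p&gt;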

&lt;p&gt;&lt;em&gt;Decision Tree Model&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2k6yye3o7ner7unhjkyo.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2k6yye3o7ner7unhjkyo.png" alt="Image description" width="727" height="152"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Decision Tree Classifier Evaluation:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;              precision    recall  f1-score   support

           0       1.00      1.00      1.00        66
           1       1.00      1.00      1.00       123

    accuracy                           1.00       189
   macro avg       1.00      1.00      1.00       189
weighted avg       1.00      1.00      1.00       189
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Decision Tree Model Accuracy is 100%&lt;/p&gt;

&lt;p&gt;Class 0 (Control Group):&lt;/p&gt;

&lt;p&gt;Precision (100%):&lt;/p&gt;

&lt;p&gt;When the model says “This is Class 0,” it’s right 100% of the time.&lt;/p&gt;

&lt;p&gt;Recall (100%):&lt;/p&gt;

&lt;p&gt;Out of all the actual Class 0 samples, it correctly identified 100% of them.&lt;/p&gt;

&lt;p&gt;F1-Score (100%):&lt;/p&gt;

&lt;p&gt;This is a balance between precision and recall. It combines how accurate and how thorough the model is for Class 0.&lt;/p&gt;

&lt;p&gt;Class 1 (Patient Group):&lt;/p&gt;

&lt;p&gt;Precision (100%):&lt;/p&gt;

&lt;p&gt;When the model says “This is Class 1,” it’s right 100% of the time.&lt;/p&gt;

&lt;p&gt;Recall (100%):&lt;/p&gt;

&lt;p&gt;Out of all the actual Class 1 samples, it correctly identified 100% of them.&lt;/p&gt;

&lt;p&gt;F1-Score (100%):&lt;/p&gt;

&lt;p&gt;This combines precision and recall for Class 1 to give a single measure of how well it’s doing.&lt;/p&gt;

&lt;p&gt;Class 0 (Control Group):&lt;/p&gt;

&lt;p&gt;The model finds all Class 0 samples (100% recall).&lt;/p&gt;

&lt;p&gt;Class 1 (Patient Group):&lt;/p&gt;

&lt;p&gt;The model is very confident when predicting Class 1 (100% precision).&lt;/p&gt;

&lt;p&gt;&lt;em&gt;K-Nearest Neighbors Model&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwh1acdsmbyl68s7voopr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwh1acdsmbyl68s7voopr.png" alt="Image description" width="730" height="150"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;K-Nearest Neighbors Evaluation:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;              precision    recall  f1-score   support

           0       1.00      0.97      0.98        66
           1       0.98      1.00      0.99       123

    accuracy                           0.99       189
   macro avg       0.99      0.98      0.99       189
weighted avg       0.99      0.99      0.99       189
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;K-Nearest Neighbors Model Accuracy is 99%&lt;/p&gt;

&lt;p&gt;Class 0 (Control Group):&lt;/p&gt;

&lt;p&gt;Precision (100%):&lt;/p&gt;

&lt;p&gt;When the model says “This is Class 0,” it’s right 100% of the time.&lt;/p&gt;

&lt;p&gt;Recall (97%):&lt;/p&gt;

&lt;p&gt;Out of all the actual Class 0 samples, it correctly identified 97% of them.&lt;/p&gt;

&lt;p&gt;F1-Score (98%):&lt;/p&gt;

&lt;p&gt;This is a balance between precision and recall. It combines how accurate and how thorough the model is for Class 0.&lt;/p&gt;

&lt;p&gt;Class 1 (Patient Group):&lt;/p&gt;

&lt;p&gt;Precision (98%):&lt;/p&gt;

&lt;p&gt;When the model says “This is Class 1,” it’s right 98% of the time.&lt;/p&gt;

&lt;p&gt;Recall (100%):&lt;/p&gt;

&lt;p&gt;Out of all the actual Class 1 samples, it correctly identified 100% of them.&lt;/p&gt;

&lt;p&gt;F1-Score (99%):&lt;/p&gt;

&lt;p&gt;This combines precision and recall for Class 1 to give a single measure of how well it’s doing.&lt;/p&gt;

&lt;p&gt;Class 0 (Control Group):&lt;/p&gt;

&lt;p&gt;The model finds most Class 0 samples (97% recall), and every Class 0 prediction it makes is correct (100% precision).&lt;/p&gt;

&lt;p&gt;Class 1 (Patient Group):&lt;/p&gt;

&lt;p&gt;The model finds every Class 1 sample (100% recall), and its Class 1 predictions are right 98% of the time.&lt;/p&gt;
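&lt;p&gt;The three classifiers above can be fitted and evaluated in one loop. The sketch below uses synthetic data from make_classification rather than the project dataset, and the hyperparameters (random_state=0, n_neighbors=5) are assumptions, so its printed reports will differ from the ones shown in this post.&lt;/p&gt;

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data with an 80/20 split.
X, y = make_classification(n_samples=300, n_features=4, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

models = {
    "Random Forest": RandomForestClassifier(random_state=0),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "K-Nearest Neighbors": KNeighborsClassifier(n_neighbors=5),
}

# Fit each model on the same training data and keep its test-set report.
reports = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    reports[name] = classification_report(y_test, model.predict(X_test))
    print(name)
    print(reports[name])
```

&lt;p&gt;Evaluating every model on the same held-out split keeps the reports directly comparable.&lt;/p&gt;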

&lt;p&gt;&lt;strong&gt;Conclusion&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The best model to use to predict the group column is the Decision Tree model.&lt;/p&gt;

&lt;p&gt;It reached 100% accuracy on the 189 test samples, classifying both the control group and the patient group correctly.&lt;/p&gt;

</description>
      <category>beginners</category>
      <category>python</category>
      <category>firstpost</category>
      <category>jupyter</category>
    </item>
  </channel>
</rss>
