DEV Community: user

Common myths about coding you should ignore

user — Tue, 22 Sep 2020 00:04:55 +0000

So today I'm gonna be busting some myths about programming, because I hear a lot of people describe coding in ways that don't match reality.
Let's start:

• Programming requires a lot of math

A lot of people think that to become a good programmer you need to be good at math, and this discourages many people who want to enter the field. For general programming, you need basic math like addition, subtraction, division, and multiplication. Just know that there are some areas of programming that require more math.

• It's a young man's game

It doesn't matter if you are 30, 40, 50. When it comes to coding, age is not a factor. Anyone can learn to code. What matters is how much effort you are willing to put into it.

• It's for the anti-social

Wrong, coding is for everyone. While coding, you're not only communicating with your computer; you also need to communicate with other developers when you need help with your code. A career as a developer is a very social one. While working on projects, either in a team or by yourself, you will need to exchange thoughts and ideas with others. Sure, you will spend a good amount of your time solving problems by yourself, and all coding projects involve a great deal of intense logical thinking and brainstorming. But when you need help, there will always be a community of fellow developers to help you.

• I need to be very smart to learn to code

What matters in coding is your motivation and hard work. To become a better programmer, what matters most is consistency. That is why I advise new programmers to take part in the 100DaysOfCode challenge: it helps build consistency and familiarity with programming.

• I need special qualifications to get into coding

This assumption could not be further from the truth. What you need is basic computer skills like using a mouse and typing. The truth is that it's when you start coding that you'll learn more about what you can do with your computer.

Final thoughts

For any myth I missed out, please drop it in the comments.

My advice is to ignore these myths because they could be discouraging which is not good for your mental state. Coding is an insanely valuable skill to learn and it could change your life for the better. Don't hesitate to learn it.


GOOD LUCK 👍

Why you should learn Git

user — Fri, 18 Sep 2020 23:04:36 +0000

Normally when you google for things like:

Tips on how to be a better programmer

You're gonna see Teamwork. This is where Git comes in.

What is Git?

Imagine you are coloring on a flower coloring book. You colored in green for all leaves and now it’s time for the best part, coloring the petal. You know you enjoy red the best but it looked horrible after you finished it. With Git, you can revert your choice of red in a heartbeat and you are free to reapply the red if you change your mind. A work doesn’t have to be permanent; every action is recorded and reversible. Source
Git is a Version Control System (VCS). On a very basic level, there are two awesome things a VCS allows you to do: You can track changes in your files, and it simplifies working on files and projects with multiple people.

Now let's focus on the main question.

Why should I learn Git?

1. Git is simple and easy to learn

I think it takes about fifteen to thirty minutes to learn the basics of Git. You could look for tutorials on YouTube. You could watch this fifteen-minute video and also download this cheatsheet. Those are two very useful resources.

2. Version control

With Git, whenever you run into issues or bugs, or you just don't know what you're doing anymore 😅 (happens to a lot of us), you can go back to the version from, say, three months ago and reassess your strategy. Git remembers every change you commit.

3. Teamwork

Git simplifies the process of working with teams. Team members can work on files and merge them with the master branch. It allows multiple people to work on the same file at the same time.

4. You would not forget what you wrote

With Git you could abandon a project for like four months (which you shouldn't) and later come back to it and you wouldn't be asking questions like:

Who wrote this ?!

because you can read through the commits to help you remember what each change in the file was for.

Also, I realized that a lot of Code Newbies say things like they'll look into Git later in their coding career. That's the wrong way of thinking. If you've already learned to code, just know that it's never too late to learn Git, but I actually feel like you should learn Git before you start coding. Just know that Git is not only for programmers.

If this article has convinced you to learn Git, click here to learn it. You're welcome.

Check out my Twitter or Instagram.

GOOD LUCK 👍

Logistic Regression with Scikit-learn

user — Fri, 18 Sep 2020 19:21:32 +0000

We'll start with the questions on your minds right now.

What is Logistic Regression?

Logistic Regression is a Machine Learning classification algorithm that is used to predict the probability of a categorical dependent variable.
In logistic regression, the dependent variable is a binary variable that contains data coded as 1 (yes, success, etc.) or 0 (no, failure, etc.).

It looks like this:

Image

What you should take from this image is that in logistic regression, your data is classified into 0 or 1.

If you've been following the series, just know that this one is special because today you're gonna do the Feature Extraction by yourself.

The question that's probably on your mind if you've not been following the series:

What is Feature Extraction?

Feature extraction is a process of dimensionality reduction by which an initial set of raw data is reduced to more manageable groups for processing.
In other terms, it is the act of selecting useful features from a dataset and dumping the rest.

Click here to download the dataset we're gonna be using today. Normally, once you click on the link it starts downloading, but as I said, this article is different. Since you're doing the Feature Extraction yourself, you'll have to know which features you're gonna select. This means that you'll have to study the attribute information yourself.

GOAL OF THE DAY:

We're gonna make a model that would be able to predict if someone has heart disease or doesn't.

We're gonna start coding now

Importing the needed libraries

import pandas as pd
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

Load and view the dataset

df = pd.read_csv('heart.csv')
df.head()

OUTPUT:

Image

Feature Extraction

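This is where you do your research and check which features are important. Put the feature columns you pick into a variable called data and the target column into a variable called labels, because the code below uses those names.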
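Here is a minimal sketch of what that step could look like. The tiny DataFrame stands in for heart.csv, and its column names and values are made up for illustration, so swap in the columns you pick after reading the attribute information.

```python
import pandas as pd

# Hypothetical stand-in for heart.csv. The column names and values are
# made up for illustration and are NOT taken from the real dataset.
df = pd.DataFrame({
    "age": [63, 37, 41, 56, 57, 45],
    "chol": [233, 250, 204, 236, 354, 210],
    "thalach": [150, 187, 172, 178, 163, 160],
    "target": [1, 1, 1, 0, 0, 0],
})

# The features you decided to keep (an assumption for this sketch).
feature_columns = ["age", "chol", "thalach"]

# data is a 2D array (rows x features), labels is the column to predict.
data = df[feature_columns].values
labels = df["target"]

print(data.shape)
print(labels.shape)
```

Whatever columns you choose, the prediction step later needs one value per selected feature, in the same order.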

Making the training and validation set

When a large amount of data is at hand, a set of samples can be set aside to evaluate the final model. The "training" data set is the general term for the samples used to create the model, while the “test” or “validation” data set is used to qualify performance.

train_data, validation_data, train_labels, validation_labels = train_test_split(
    data,
    labels,
    train_size=0.8,
    test_size=0.2,
    random_state=1)

train_size is the fraction of the data that goes into the training set, and test_size is the fraction that goes into the validation set (here 80% and 20%).
random_state fixes the seed used to shuffle the data, so you get the same split every time the code is run.
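To see what these parameters do, here is a small sketch on made-up data (ten rows of toy numbers, not the heart dataset). It splits the same data twice with the same random_state and checks that both splits come out identical.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy data: 10 rows with 2 features each, plus alternating 0/1 labels.
data = np.arange(20).reshape(10, 2)
labels = np.array([0, 1] * 5)

# Split twice with the same random_state.
split_a = train_test_split(data, labels, train_size=0.8, test_size=0.2, random_state=1)
split_b = train_test_split(data, labels, train_size=0.8, test_size=0.2, random_state=1)

train_data, validation_data, train_labels, validation_labels = split_a

# 80% of 10 rows go to training, 20% to validation.
print(train_data.shape, validation_data.shape)

# True when every piece of the two splits is identical.
same = all(np.array_equal(a, b) for a, b in zip(split_a, split_b))
print(same)
```

If you change random_state to another number, the rows are shuffled differently, which is why your score can move a little from one seed to another.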

Making a model

model = LogisticRegression()
model.fit(train_data,train_labels)
print(model.score(validation_data,validation_labels))

OUTPUT:

0.7704918032786885

The score is not too bad but it's not good.
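One thing you could try is scaling your features before fitting. LogisticRegression uses an iterative solver, and features on very different scales can make it harder for the solver to converge. The sketch below uses synthetic data from make_classification as a stand-in (an assumption, since your selected features will differ), so it does not show what score you would get on the heart dataset.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data: 300 samples with 13 features.
X, y = make_classification(n_samples=300, n_features=13, random_state=1)

X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, random_state=1)

# The pipeline scales the features, then fits the logistic regression.
# max_iter=1000 gives the solver more iterations than the default.
scaled_model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
scaled_model.fit(X_train, y_train)

score = scaled_model.score(X_val, y_val)
print(score)
```

The pipeline learns the scaling from the training set only and reuses it on the validation set, so no information leaks from the validation data.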

Making predictions with your model

Now it's time to make a prediction, using the features that you've picked.

print(model.predict([[63,1,4,141,233,1,1,150,0,2.3,0,0,1]]))

OUTPUT:

[1]
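Since logistic regression predicts a probability, you can also ask the model how confident it is with predict_proba. This is a small sketch on made-up one-feature data, not the heart dataset.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy data: small values belong to class 0, large values to class 1.
X = np.array([[1.0], [2.0], [3.0], [4.0], [5.0], [6.0], [7.0], [8.0]])
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])

model = LogisticRegression()
model.fit(X, y)

sample = [[7.5]]

# predict returns the class, predict_proba returns one probability per class.
predicted_class = model.predict(sample)
probabilities = model.predict_proba(sample)

print(predicted_class)
print(probabilities)
```

The columns of predict_proba follow the order of model.classes_, and predict returns the class with the highest probability.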

You can visit Kaggle to find more datasets that you can perform Logistic Regression on.

Check out my Twitter or Instagram.

Feel free to ask questions in the comments.

GOOD LUCK 👍

K-Nearest Neighbors with Scikit-learn

user — Fri, 18 Sep 2020 19:18:19 +0000

Before we start talking about K-Nearest Neighbors, I'm going to list other common classification algorithms in Machine Learning:

Logistic regression
Support Vector Machines
Decision trees
Random forests
Naive Bayes classifier

Now I'm gonna focus on the questions that are probably in your head right now.

What is the K-Nearest Neighbors algorithm?

This algorithm is used to solve classification problems. The K-nearest neighbors (K-NN) algorithm classifies a new data point by looking at the k training points closest to it and picking the class that most of them belong to. You can picture it as drawing an imaginary boundary around the new point that is just big enough to contain its k nearest neighbors.

It could look like this:

Image

From this image, you can see that:

When k=3, the new data point (the star) is going to be classified into Class B because there are more Class B data points inside the imaginary boundary.
When k=6, the new data point (the star) is going to be classified into Class A because there are more Class A data points inside the imaginary boundary.
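You can reproduce that effect with a few made-up points. In this sketch the coordinates are invented so that the two closest points to the star are Class B and the next four are Class A. It is not the data from the image.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Made-up 2D points. The new point (the star) sits at the origin.
points = np.array([
    [1.0, 0.0],    # B, distance 1.0 from the star
    [0.0, 1.2],    # B, distance 1.2
    [10.0, 10.0],  # B, far away
    [10.0, 11.0],  # B, far away
    [1.5, 0.0],    # A, distance 1.5
    [0.0, 2.0],    # A, distance 2.0
    [-2.2, 0.0],   # A, distance 2.2
    [0.0, -2.5],   # A, distance 2.5
])
classes = np.array(["B", "B", "B", "B", "A", "A", "A", "A"])
star = [[0.0, 0.0]]

# k=3: the neighbors are B, B, A, so B wins the vote.
pred_k3 = KNeighborsClassifier(n_neighbors=3).fit(points, classes).predict(star)[0]

# k=6: the neighbors are B, B, A, A, A, A, so A wins the vote.
pred_k6 = KNeighborsClassifier(n_neighbors=6).fit(points, classes).predict(star)[0]

print(pred_k3, pred_k6)
```

Same data, same new point, different k, different answer. That is why choosing k matters.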

Before we start coding, you'll need to download the dataset we're gonna use. Click here to download it, then open the file named Breast_cancer_data.csv. You should see something like this:

Image

GOAL OF THE DAY

We're gonna make a classification model that would be able to predict whether a breast tumor is cancerous or not.

We're gonna start coding now.

Importing the needed libraries

import pandas as pd
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

Load and view the dataset

df = pd.read_csv('Breast_cancer_data.csv')
df.head()

OUTPUT

Image

Feature Extraction

data = df[["mean_radius", "mean_texture", "mean_perimeter", "mean_area", "mean_smoothness"]]
data = data.values.reshape(-1,5)
labels = df["diagnosis"]

Making a classification model

classifier = KNeighborsClassifier(n_neighbors=100)
classifier.fit(data, labels)
print(classifier.score(data, labels))

Just know that n_neighbors represents k, the number of neighbors that vote on each prediction.

OUTPUT

0.8945518453427065
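Keep in mind that this score was measured on the same rows the model was trained on, so it can look better than it would on new data. Here is a sketch of scoring on held-out data instead. It assumes scikit-learn's built-in copy of the breast cancer data (load_breast_cancer) is a reasonable stand-in for the CSV, and it uses the first five columns, which are the five mean features selected above. The 0/1 label coding in the built-in copy may not match the CSV.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Built-in copy of the breast cancer data, used here as a stand-in
# for Breast_cancer_data.csv (an assumption of this sketch).
dataset = load_breast_cancer()

# First five columns: mean radius, texture, perimeter, area, smoothness.
data = dataset.data[:, :5]
labels = dataset.target

# Hold back 20% of the rows so the model never sees them during training.
train_data, validation_data, train_labels, validation_labels = train_test_split(
    data, labels, test_size=0.2, random_state=1)

classifier = KNeighborsClassifier(n_neighbors=100)
classifier.fit(train_data, train_labels)

train_score = classifier.score(train_data, train_labels)
validation_score = classifier.score(validation_data, validation_labels)

print(train_score)
print(validation_score)
```

The validation score is the more honest number, because it tells you how the model does on rows it has not memorized.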

Making predictions with your model

print(classifier.predict([[7.76,24.54,47.92,181.0,0.05263]]))

OUTPUT

[1]

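If 1 is the label for a cancerous tumor in this dataset, this shows that the tumor is cancerous, so check the dataset's description to confirm what 0 and 1 stand for. Either way, our model's prediction might be wrong. Just know that even if your model has a high score, some of its predictions might still be wrong.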

You can visit Kaggle to find more datasets that you can classify with K-Nearest Neighbors.

Check out my Twitter or Instagram.

Feel free to ask questions in the comments.

GOOD LUCK 👍

Linear Regression with Scikit-learn (Part 2)

user — Fri, 18 Sep 2020 19:15:04 +0000

This is the second part, and here we'll be talking about Multiple Linear Regression.

Questions on all your minds:

What is Multiple Linear Regression?

It is a statistical technique that uses several explanatory variables to predict the outcome of a response variable.
Multiple Linear Regression is used to estimate the relationship between two or more independent variables and one dependent variable.

With Multiple Linear Regression (MLR), you can predict the price of a car, house, and more.

Before we start coding, you'll need to download the dataset we're gonna use. Click here to download it, then open the file named 50_Startups.csv. You should see something like this:

Image

GOAL OF THE DAY

We're going to make a regression model that would be able to predict the profit of Startups.

We're gonna start coding now.

Importing the libraries

import pandas as pd
import numpy as np
from sklearn.linear_model import LinearRegression

Load and view dataset

df = pd.read_csv('50_Startups.csv')
df.head()

OUTPUT

Image

Feature Extraction

What is Feature Extraction?

Feature extraction is a process of dimensionality reduction by which an initial set of raw data is reduced to more manageable groups for processing.
In other terms, it is the act of selecting useful features from a dataset and dumping the rest.

data = df[['R&D Spend', 'Administration', 'Marketing Spend']]
data = data.values.reshape(-1,3)
labels = df[['Profit']]

As you can see, I did not select the State column to be part of the data. It is a text column, so LinearRegression could not use it without first converting it to numbers, and I did not consider it necessary for this prediction. Unnecessary features can also lower the accuracy of your model.
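If you ever do want to use a text column like State, one common option is one-hot encoding with pandas. The sketch below uses a tiny made-up table (the numbers are invented, not rows from 50_Startups.csv) to show what the encoded columns look like.

```python
import pandas as pd

# Made-up rows for illustration only. These are NOT values from 50_Startups.csv.
df = pd.DataFrame({
    "R&D Spend": [100000.0, 120000.0, 90000.0, 110000.0],
    "State": ["New York", "California", "Florida", "New York"],
    "Profit": [150000.0, 170000.0, 140000.0, 160000.0],
})

# One new 0/1 column per state. dtype=float keeps every column numeric.
encoded = pd.get_dummies(df, columns=["State"], dtype=float)

print(encoded.columns.tolist())
print(encoded.shape)
```

Each row gets a 1.0 in the column for its own state and 0.0 in the others, so a model that only understands numbers can use the information.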

Making a Regression Model

model = LinearRegression()
model.fit(data,labels)
print(model.score(data,labels))

OUTPUT

0.9507459940683246

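TAKE NOTE: For a regression model, .score() returns R² (the coefficient of determination), not a classification accuracy. The closer it is to 1.0, the better the model fits the data. Also note that this score is computed on the same data the model was trained on.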

Making predictions with your model

print(model.predict([[165349.20, 136897.80, 471784.10]]))
print(model.predict([[144372.41, 118671.85, 383199.62]]))

OUTPUT

[[192521.25289008]]
[[173696.70002553]]

That's how simple it is. What you've done now is that you've predicted the profit of a Startup from some of their expenses.
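Under the hood, a Multiple Linear Regression prediction is just an intercept plus one coefficient per feature. Here is a small sketch on made-up data where the true relationship is known (y = 2*x1 + 3*x2 + 5), so you can see the model recover it. The numbers are invented and have nothing to do with the startup dataset.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Made-up data that follows y = 2*x1 + 3*x2 + 5 exactly.
X = np.array([[1, 2], [2, 1], [3, 5], [4, 3], [5, 8]], dtype=float)
y = 2 * X[:, 0] + 3 * X[:, 1] + 5

model = LinearRegression()
model.fit(X, y)

# One coefficient per feature, plus the intercept.
print(model.coef_)
print(model.intercept_)

# A prediction is intercept + coef[0]*x1 + coef[1]*x2.
new_point = np.array([[6.0, 4.0]])
manual = model.intercept_ + new_point @ model.coef_
print(manual, model.predict(new_point))
```

Looking at coef_ on your own model tells you how much the prediction moves when one feature goes up by one unit and the others stay the same.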

You can visit Kaggle to find more datasets that you can perform Linear Regression on.

Check out my Twitter or Instagram.

Feel free to ask questions in the comments.

GOOD LUCK 👍

Linear Regression with Scikit-learn (Part 1)

user — Mon, 14 Sep 2020 10:55:47 +0000

First off let's start with the questions on your mind:

What is Scikit-learn?

Scikit-learn is a Python library for machine learning. It features various algorithms like support vector machines, random forests, and k-nearest neighbors, some of which you are going to learn in this series.

What is Linear Regression?

Linear Regression is a statistical way of modeling the relationship between variables by fitting a straight line to the data. Just know that with Linear Regression, you can use that line to predict values you have not seen yet.

There are two types of Linear Regression:

Simple Linear Regression
Multiple Linear Regression

Just know that Multiple Linear Regression is an extension of Simple Linear Regression. It is used when we want to predict the value of a variable based on the value of two or more other variables.

That's enough information for now. We're gonna start coding.

This first article is for Simple Linear Regression. The second part is for Multiple Linear Regression.

We have to install the following libraries using pip:

pip install pandas
pip install numpy
pip install scikit-learn

Click here to download the dataset we're gonna use. Then extract the Salary_Data.csv file inside it.

You should see a .csv file like this:

   YearsExperience   Salary
0              1.1  39343.0
1              1.3  46205.0
2              1.5  37731.0
3              2.0  43525.0
4              2.2  39891.0

The data explanation:
As you can see, there is a column called YearsExperience. This is the feature. In ML, a feature is an individual measurable property or characteristic of a phenomenon being observed.
There is also a column called Salary. This is the label. In ML, a label is the thing we're predicting. It's the y variable in Simple Linear Regression.

Open your Code Editor and make a new Python file called: linear_regression.py or you could open a Jupyter Notebook.

Importing the needed libraries

import pandas as pd
import numpy as np
from sklearn.linear_model import LinearRegression

We use the as keyword to give the imported module an alias to make our code shorter.

Load and view dataset

df = pd.read_csv('Salary_Data.csv')
print(df.head())

OUTPUT:

   YearsExperience   Salary
0              1.1  39343.0
1              1.3  46205.0
2              1.5  37731.0
3              2.0  43525.0
4              2.2  39891.0

Feature Extraction

x = df['YearsExperience']
x = x.values.reshape(-1, 1)
y = df['Salary']

Making a regression model

model = LinearRegression()
model.fit(x,y)
print(model.score(x,y))

Just know that the last line, print(model.score(x,y)), checks how well your model fits the data. For a regression model, the .score() function returns R² (the coefficient of determination).
Below is the output of the print() statement above.

OUTPUT

0.9569566641435086

The closer the score is to 1, the better the line fits the data.
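If you're curious where that number comes from, for a regression model .score() returns R², which you can also compute by hand. This sketch uses only the five rows printed by df.head() above, so the number will not match the output for the full file.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Only the five rows shown by df.head(), not the whole Salary_Data.csv.
x = np.array([[1.1], [1.3], [1.5], [2.0], [2.2]])
y = np.array([39343.0, 46205.0, 37731.0, 43525.0, 39891.0])

model = LinearRegression()
model.fit(x, y)

predictions = model.predict(x)

# R² = 1 - (sum of squared errors) / (total variation around the mean).
ss_res = np.sum((y - predictions) ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
manual_r2 = 1 - ss_res / ss_tot

print(manual_r2)
print(model.score(x, y))
```

Both lines print the same value, which shows that .score() measures how much of the variation in y the line explains.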

Making predictions with your model

print(model.predict([[3]]))
print(model.predict([[4]]))
print(model.predict([[5]]))

OUTPUT

[54142.08716303]
[63592.04948449]
[73042.01180594]

That's how simple it is. What you've done now is that you've predicted the salary of a person from their years of experience.

You can visit Kaggle to find more datasets that you can perform Linear Regression on.

Feel free to ask questions.
GOOD LUCK 👍