
KamauRuth


FAKE NEWS DETECTION.

Fake News Detection
Fake news is a serious problem worldwide because it spreads misinformation. Much of it targets a community's political or religious beliefs, and it can lead to riots and violence, as you may have seen in your own country. To detect fake news, we can look for patterns in fake news headlines and train a machine learning model that tells us whether a piece of news is fake or real from its headline alone.

Fake News Detection using Python
My dataset for the fake news detection task contains the news title, the news content, and a column called label that shows whether the news is fake or real. We can use this dataset to find relationships between fake and real news headlines and to understand what kinds of headlines appear most often in fake news. Below is my dataset.

DATASET

#we import the libraries we need in our notebook
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import warnings
#importing our data through pandas
data=pd.read_csv('/content/drive/MyDrive/news.csv')
#we check what our data looks like
data
#we do data analysis and preprocessing
data.describe()


#To check the number of non-missing values in each column of our data
data.count()
#To check whether our data has missing values
data.isnull().sum()
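If the check above reports missing values, one possible way to handle them is to drop the affected rows before encoding anything. The sketch below uses a made-up toy frame with the same column names as the dataset (title, text, label); the rows and the choice of which columns to require are assumptions for illustration, not part of the original notebook.

```python
import pandas as pd

# Toy rows for illustration only; they are not taken from news.csv.
toy = pd.DataFrame({
    "title": ["Headline A", None, "Headline C"],
    "text": ["Body A", "Body B", None],
    "label": ["REAL", "FAKE", "FAKE"],
})

# Count the missing values per column, as in the step above.
missing_per_column = toy.isnull().sum()
print(missing_per_column)

# One option: keep only the rows that have both a title and a label.
cleaned = toy.dropna(subset=["title", "label"]).reset_index(drop=True)
print(len(cleaned))
```

Dropping rows is the simplest choice; filling missing text with an empty string is another option if you cannot afford to lose rows.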
#we convert label into 1 (REAL) and 0 (anything else)
data['label'] = data['label'].apply(lambda x: 1 if x == 'REAL' else 0)
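Note that the lambda above turns every value other than 'REAL' into 0, so a misspelled or lowercase label would silently count as fake. A minimal sketch of a stricter variant is shown here, assuming the only other valid label is 'FAKE' (the source only confirms 'REAL'); the toy values and the `label_map` name are hypothetical.

```python
import pandas as pd

# Toy label values for illustration; "FAKE" as the second class is an assumption.
labels = pd.Series(["REAL", "FAKE", "real ", "unknown"])

# Normalise case and whitespace, then map with an explicit dictionary.
label_map = {"REAL": 1, "FAKE": 0}
encoded = labels.str.strip().str.upper().map(label_map)

# Values outside the mapping become NaN, so they can be inspected
# instead of silently turning into 0.
unexpected = labels[encoded.isna()]
print(encoded.tolist())
print(unexpected.tolist())
```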
#we convert our data from categorical to numerical (one column per distinct title and text value)
dummies=pd.get_dummies(data[['title','text']])
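One thing to keep in mind: `pd.get_dummies` on free text creates one column for each distinct title or article, so two different headlines share no features even when they use the same words. A common alternative for text is TF-IDF, which builds features from the words themselves. The sketch below swaps in scikit-learn's `TfidfVectorizer` on a few made-up headlines; it is an illustration of that alternative, not the method used in this post.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Made-up headlines for illustration only.
titles = [
    "Scientists publish new climate report",
    "Aliens secretly control the stock market",
    "Local council approves new budget",
]

# Each word becomes a feature; common English stop words are removed.
vectorizer = TfidfVectorizer(stop_words="english", lowercase=True)
features = vectorizer.fit_transform(titles)  # sparse matrix, one row per title

print(features.shape)
print(sorted(vectorizer.vocabulary_))
```

Because the features are words, a headline the model has never seen can still overlap with training headlines, which is what lets a classifier generalise.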
#we split our data into x and y
x=data.drop('label',axis=1)
y=data['label']
#let's import the model and the train/test split helper
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
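#we set a seed so the split and the model are reproducible
np.random.seed(42)
model=RandomForestClassifier()
#we keep 20% of the rows for testing
x_train,x_test,y_train,y_test=train_test_split(dummies,
                                              y,
                                              test_size=0.2)
#we train the model on the training rows
model.fit(x_train,y_train)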
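The split-and-fit step can also be written with a scikit-learn pipeline, so that the text features are learned from the training rows only. The sketch below is a hedged variant, not the code from this post: it assumes TF-IDF features on headlines, uses made-up headlines and labels, and passes `stratify` and `random_state` so both classes appear in the test set and the run is repeatable.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

# Made-up headlines and labels (1 = real, 0 = fake), for illustration only.
titles = [
    "Council approves new school budget",
    "Miracle pill cures every disease overnight",
    "Central bank holds interest rates steady",
    "Celebrity replaced by clone, insiders say",
    "Heavy rain expected across the region",
    "Moon landing filmed in secret desert studio",
    "University opens new research laboratory",
    "Drinking bleach prevents all infections",
    "City marathon attracts record turnout",
    "Government hides proof of talking animals",
]
labels = [1, 0, 1, 0, 1, 0, 1, 0, 1, 0]

# stratify keeps the real/fake ratio the same in the train and test sets.
x_train, x_test, y_train, y_test = train_test_split(
    titles, labels, test_size=0.2, stratify=labels, random_state=42
)

# The vectorizer is fitted inside the pipeline, on the training headlines only.
pipeline = make_pipeline(
    TfidfVectorizer(),
    RandomForestClassifier(random_state=42),
)
pipeline.fit(x_train, y_train)

predictions = pipeline.predict(x_test)
print(len(predictions))
```

With only ten toy rows the predictions mean nothing; the point is the shape of the workflow.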
#we check the accuracy of the model on the test rows
model.score(x_test,y_test)
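For a classifier, `score` returns accuracy, which can hide how the model does on each class. A minimal sketch of a more detailed check is given below, using scikit-learn's `confusion_matrix` and `classification_report`. The `y_true` and `y_pred` values are made up for illustration; in the notebook they would be `y_test` and `model.predict(x_test)`.

```python
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

# Made-up true labels and predictions (1 = real, 0 = fake), for illustration only.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

# The same number that model.score would report for these predictions.
accuracy = accuracy_score(y_true, y_pred)

# Rows are the true classes, columns are the predicted classes, in the order [0, 1].
matrix = confusion_matrix(y_true, y_pred, labels=[0, 1])

print(accuracy)
print(matrix)
print(classification_report(y_true, y_pred, target_names=["FAKE", "REAL"]))
```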
