Fake News Detection
Fake news is a serious problem because it spreads misinformation. Much of it targets a community's political and religious beliefs and can lead to riots and violence, as you may have seen in your own country. To detect fake news, we can look for patterns in fake news headlines and train a machine learning model that tells us whether a piece of news is fake or real just from its headline.
Fake News Detection using Python
My dataset for the fake news detection task has the news title, the news content, and a column called label that shows whether the news is fake or real. We can use this dataset to find relationships between fake and real news headlines and understand what kinds of headlines appear in most fake news. Below is my dataset.
DATASET
#we import the libraries we need in our notebook
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import warnings
#importing our data through pandas
data=pd.read_csv('/content/drive/MyDrive/news.csv')
#we check what our data looks like
data
#we do data analysis and preprocessing
data.describe()
#To check the number of non-null values in each column
data.count()
#To check whether our data has missing values
data.isnull().sum()
#we convert label into 1 and 0
data['label'] = data['label'].apply(lambda x: 1 if x == 'REAL' else 0)
#we convert our data from categorical to numerical (one-hot encoding: each distinct title or text becomes its own column)
dummies=pd.get_dummies(data[['title','text']])
#we split our data into x and y
x=data.drop('label',axis=1)
y=data['label']
#let's import the model
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
np.random.seed(42)
model=RandomForestClassifier()
x_train,x_test,y_train,y_test=train_test_split(dummies,
                                               y,
                                               test_size=0.2)
model.fit(x_train,y_train)
model.score(x_test,y_test)
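One thing to keep in mind about the get_dummies step above: it treats every distinct title or text as its own category, so a headline the model has never seen shares no columns with the training data. A common alternative, which is not what the code above does, is to turn each headline into word-level TF-IDF features so that new headlines can reuse words seen during training. Below is a minimal sketch of that idea. The toy headlines, their labels, and the names toy and text_model are made up for illustration and are not taken from news.csv.

```python
# Sketch only: TF-IDF features instead of one-hot encoding whole strings.
# The headlines and labels below are invented for illustration.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

# Hypothetical stand-in for the 'title' and 'label' columns (1 = real, 0 = fake)
toy = pd.DataFrame({
    "title": [
        "Government announces new budget for public schools",
        "Scientists publish study on regional rainfall patterns",
        "City council approves road repair plan",
        "Central bank keeps interest rates unchanged",
        "Shocking secret cure doctors do not want you to know",
        "Aliens endorse candidate in surprise election twist",
        "Celebrity reveals miracle diet that melts fat overnight",
        "You will not believe what this politician is hiding",
    ],
    "label": [1, 1, 1, 1, 0, 0, 0, 0],
})

# Split the raw text; the vectorizer is fitted on the training part only
x_train, x_test, y_train, y_test = train_test_split(
    toy["title"], toy["label"],
    test_size=0.25, random_state=42, stratify=toy["label"],
)

# The pipeline converts headlines to TF-IDF vectors, then fits the classifier
text_model = make_pipeline(
    TfidfVectorizer(stop_words="english"),
    RandomForestClassifier(random_state=42),
)
text_model.fit(x_train, y_train)

# Accuracy on the held-out headlines (not meaningful on a toy set this small)
accuracy = text_model.score(x_test, y_test)

# Predict the label of a headline the model has not seen before
prediction = text_model.predict(["Miracle cure announced by secret scientists"])
```

With the real dataset, data['title'] and the converted data['label'] would take the place of the toy columns. Because the vectorizer sits inside the pipeline, text_model.predict accepts plain headline strings directly.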