Building a Deep Learning Model to identify unreliable news articles
What is Fake news?
Fake news is false or misleading information presented as news. It often aims to damage the reputation of a person or entity or make money through advertising revenue.
However, the term has no fixed definition. It has been applied more broadly to any type of false information, including information spread through unintentional and unconscious mechanisms, and high-profile individuals have used it to describe any news unfavorable to their personal perspectives.
Aim
To develop a fake news classifier using a Bidirectional Long Short-Term Memory (LSTM) network, written in Python with Keras on the Cainvas platform.
Prerequisites
Before getting started, you should have a good understanding of:
- Python programming language
- Keras — Deep learning library
Dataset
We will use the train.csv dataset to train the model and then make predictions for the test.csv dataset.
You can download both CSV files from Kaggle:
URL: https://www.kaggle.com/c/fake-news/data
Importing all the required libraries
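This walkthrough relies on pandas to process the data, NumPy for array operations, NLTK for the stopwords list, scikit-learn to split the data, and Keras to build the model.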
Load and Process Data
Let’s load our data file train.csv using pandas.
Drop the rows that contain NaN values:
Load X with the independent features and y with the dependent feature (the label):
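The loading, cleaning, and feature/label steps can be sketched as follows. This is a minimal sketch rather than the notebook's exact code: it assumes the Kaggle file has the columns id, title, author, text, and label, and it reads a tiny hypothetical in-memory sample so that it runs without the download.

```python
import io

import pandas as pd

# Hypothetical three-row stand-in for train.csv
# (assumed columns: id, title, author, text, label).
sample_csv = io.StringIO(
    "id,title,author,text,label\n"
    "0,Budget vote delayed,A. Writer,Lawmakers postponed the vote.,0\n"
    "1,Miracle cure found,,Doctors hate this trick.,1\n"
    "2,Storm hits coast,B. Reporter,Heavy rain is expected.,0\n"
)

df = pd.read_csv(sample_csv)  # in practice: pd.read_csv("train.csv")

# Drop every row that has at least one missing value (row 1 has no author).
df = df.dropna()

# Independent features: every column except the label.
X = df.drop(columns=["label"])
# Dependent feature: the label column.
y = df["label"]

print(X.shape, y.shape)
```

Note that dropna() keeps the original row numbers, so the index of X now has a gap where the incomplete row was removed.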
One-hot Representation:
Vocabulary size:
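In this setting, a one-hot representation usually means replacing each word with an integer index smaller than the vocabulary size, which is what Keras' one_hot helper does by hashing each word. The sketch below illustrates that idea in plain Python. It is not Keras' implementation: the function name one_hot_sketch, the md5 hash, and the vocabulary size of 5000 are all assumptions made for illustration.

```python
import hashlib

VOC_SIZE = 5000  # assumed vocabulary size; the article does not state its value


def one_hot_sketch(text, vocab_size):
    """Map each word to an integer in [1, vocab_size - 1] with a stable hash."""
    indices = []
    for word in text.lower().split():
        digest = hashlib.md5(word.encode("utf-8")).hexdigest()
        # Index 0 is never produced, so it stays free to act as a padding value.
        indices.append(int(digest, 16) % (vocab_size - 1) + 1)
    return indices


encoded = one_hot_sketch("storm hits coast as storm grows", VOC_SIZE)
print(encoded)
```

The same word always maps to the same index, but two different words can collide on one index. A larger vocabulary size makes collisions less likely.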
Getting a copy of the independent features:
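Working on a copy leaves the original features untouched, and resetting the index matters because dropping rows leaves gaps in the row numbers. The sketch below uses a small hypothetical table to show the effect. The variable name messages is an assumption.

```python
import pandas as pd

# Hypothetical feature table whose index has gaps, as it would after dropna().
X = pd.DataFrame(
    {"title": ["Budget vote delayed", "Storm hits coast", "New bridge opens"]},
    index=[0, 2, 5],
)

# Work on a copy so the original features stay untouched.
messages = X.copy()

# Renumber the rows 0..n-1; drop=True discards the old index
# instead of keeping it as an extra column.
messages = messages.reset_index(drop=True)

# Lookups by row number now succeed for every i in range(len(messages)).
titles = [messages["title"][i] for i in range(len(messages))]
print(titles)
```

Without the reset, a loop over range(len(messages)) would fail on the missing row numbers (1 in this example).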
Downloading stopwords:
We use NLTK’s stopwords corpus to remove stopwords from our data, NumPy for array operations, and pandas to process the data.
Dataset Preprocessing:
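A typical cleaning pass keeps only letters, lowercases the text, and drops stopwords. The sketch below shows this on two hypothetical titles. It assumes the title column is the text being cleaned, and it uses a tiny hard-coded stopword set so that it runs offline. With NLTK installed, nltk.download("stopwords") followed by stopwords.words("english") from nltk.corpus supplies the full list.

```python
import re

# Tiny stand-in for NLTK's English stopword list (an assumption for this sketch).
STOPWORDS = {"a", "an", "the", "is", "are", "to", "of", "in", "on", "and", "this"}


def clean_title(title):
    """Keep letters only, lowercase the text, and drop stopwords."""
    letters_only = re.sub(r"[^a-zA-Z]", " ", title)
    words = letters_only.lower().split()
    return " ".join(word for word in words if word not in STOPWORDS)


titles = ["The Storm Hits the Coast!", "Is this a Miracle Cure? 10 Doctors Respond"]
corpus = [clean_title(title) for title in titles]
print(corpus)
# ['storm hits coast', 'miracle cure doctors respond']
```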
Embedding Representation:
For background on embeddings, refer to: https://towardsdatascience.com/neural-network-embeddings-explained-4d028e6f0526
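An embedding layer expects every input sequence to have the same length, so the lists of word indices are padded to a fixed size first. Keras offers pad_sequences for this, which by default pads and truncates at the front. The plain-Python sketch below mimics that default behaviour. The helper name pad_pre and the sentence length of 5 are assumptions chosen to keep the output readable.

```python
def pad_pre(sequence, maxlen, value=0):
    """Left-pad (or left-truncate) a list of word indices to exactly maxlen items."""
    trimmed = sequence[-maxlen:]  # keep the last maxlen indices if the list is too long
    return [value] * (maxlen - len(trimmed)) + trimmed


SENT_LENGTH = 5  # assumed sequence length, kept small for display
encoded_docs = [[312, 47], [9, 88, 1203, 4, 671, 15]]  # hypothetical word indices

embedded_docs = [pad_pre(doc, SENT_LENGTH) for doc in encoded_docs]
print(embedded_docs)
# [[0, 0, 0, 312, 47], [88, 1203, 4, 671, 15]]
```

Padding at the front keeps the real words next to the end of the sequence, which is the last thing the forward LSTM reads.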
Building the model:
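One way to assemble a Bidirectional LSTM classifier in Keras is sketched below. The layer sizes, vocabulary size, and sequence length are assumptions, since the article does not list them. The sketch also assumes that label 1 marks an unreliable article.

```python
from tensorflow import keras
from tensorflow.keras import layers

VOC_SIZE = 5000    # assumed vocabulary size
SENT_LENGTH = 20   # assumed padded sequence length
EMBED_DIM = 40     # assumed embedding vector size
LSTM_UNITS = 100   # assumed number of LSTM units per direction

model = keras.Sequential(
    [
        keras.Input(shape=(SENT_LENGTH,)),
        # Turns each word index into a dense vector learned during training.
        layers.Embedding(input_dim=VOC_SIZE, output_dim=EMBED_DIM),
        # Reads the sequence forwards and backwards and concatenates both summaries.
        layers.Bidirectional(layers.LSTM(LSTM_UNITS)),
        # One sigmoid unit: the predicted probability of label 1.
        layers.Dense(1, activation="sigmoid"),
    ]
)

# Binary cross-entropy fits a two-class problem with a single sigmoid output.
model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])
model.summary()
```

A dropout layer between the LSTM and the output is a common addition when the model overfits.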
Train-test split:
Here we use scikit-learn’s sklearn.model_selection package to split the data into training data and test data.
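The split can be sketched with train_test_split on hypothetical stand-in arrays. The 25% test fraction and the random_state value are assumptions, as are the variable names.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical stand-ins: 20 padded sequences of length 5 and their 0/1 labels.
X_final = np.arange(100).reshape(20, 5)
y_final = np.array([0, 1] * 10)

# Hold out 25% of the rows for testing; random_state makes the shuffle repeatable.
X_train, X_test, y_train, y_test = train_test_split(
    X_final, y_final, test_size=0.25, random_state=42
)

print(X_train.shape, X_test.shape)
# (15, 5) (5, 5)
```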
Training Model:
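Training comes down to a call to fit with the training data and, optionally, held-out data for validation. The sketch below uses synthetic random data and a deliberately small model so that it finishes in seconds. None of the sizes, the epoch count, or the batch size come from the article, and random labels will not produce a meaningful accuracy.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

rng = np.random.default_rng(0)

# Synthetic stand-in data: padded sequences of 20 word indices with random 0/1 labels.
X_train = rng.integers(1, 5000, size=(64, 20))
y_train = rng.integers(0, 2, size=(64,)).astype("float32")
X_test = rng.integers(1, 5000, size=(16, 20))
y_test = rng.integers(0, 2, size=(16,)).astype("float32")

# Deliberately small model so the sketch trains quickly.
model = keras.Sequential(
    [
        keras.Input(shape=(20,)),
        layers.Embedding(input_dim=5000, output_dim=8),
        layers.Bidirectional(layers.LSTM(8)),
        layers.Dense(1, activation="sigmoid"),
    ]
)
model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])

# validation_data reports loss and accuracy on the held-out rows after every epoch.
history = model.fit(
    X_train,
    y_train,
    validation_data=(X_test, y_test),
    epochs=2,
    batch_size=16,
    verbose=0,
)
print(sorted(history.history))
```

Comparing the training and validation curves stored in history.history is a quick way to spot overfitting.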
Predicting and Heat Map:
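A sigmoid output is a probability, so it has to be thresholded before it can be compared with the true labels. The sketch below does this for six hypothetical predictions and builds the confusion matrix that a heat map would display. The 0.5 threshold and all values are assumptions.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Hypothetical model outputs: sigmoid probabilities for six test articles.
y_prob = np.array([0.1, 0.6, 0.8, 0.4, 0.9, 0.2])
y_test = np.array([0, 0, 1, 1, 1, 0])

# Threshold at 0.5 to turn probabilities into 0/1 class labels.
y_pred = (y_prob > 0.5).astype(int)

# Rows are true labels, columns are predicted labels.
cm = confusion_matrix(y_test, y_pred)
print(cm)
# [[2 1]
#  [1 2]]
```

The matrix can then be passed to a plotting helper such as seaborn's heatmap with annot=True to draw the counts in each cell.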
Accuracy of the Model:
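Accuracy is the share of predictions that match the true labels. A minimal sketch with hypothetical labels:

```python
import numpy as np

# Hypothetical true labels and thresholded predictions for six test articles.
y_test = np.array([0, 0, 1, 1, 1, 0])
y_pred = np.array([0, 1, 1, 0, 1, 0])

# Accuracy = number of correct predictions / number of predictions.
accuracy = float(np.mean(y_test == y_pred))
print(f"accuracy: {accuracy:.3f}")
# accuracy: 0.667
```

scikit-learn's accuracy_score(y_test, y_pred) returns the same value.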
Loading the test data:
Making Predictions for test data:
Joining the test data and predicted labels:
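The test-set steps can be sketched as below. Several things here are assumptions: the column names of test.csv, filling missing values instead of dropping rows (so that every id still receives a prediction), and the placeholder array that stands in for the thresholded model output.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical three-row stand-in for test.csv (assumed columns: id, title, author, text).
test_csv = io.StringIO(
    "id,title,author,text\n"
    "20800,Storm hits coast,B. Reporter,Heavy rain is expected.\n"
    "20801,,Unknown,Text without a title.\n"
    "20802,Miracle cure found,,Doctors hate this trick.\n"
)
test_df = pd.read_csv(test_csv)  # in practice: pd.read_csv("test.csv")

# Assumption: fill gaps instead of dropping rows, so no id loses its prediction.
test_df = test_df.fillna(" ")

# Stand-in for the thresholded output of model.predict on the preprocessed test data.
predicted_labels = np.array([0, 1, 1])

# Pair each id with its predicted label.
submission = pd.DataFrame({"id": test_df["id"], "label": predicted_labels})
print(submission)
```

The test rows have to pass through the same cleaning, encoding, and padding steps as the training rows before the model sees them.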
URL to access the Notebook: https://cainvas.ai-tech.systems/use-cases/fake-news-classification-app-using-lstm/
Conclusion
We trained a simple Bidirectional LSTM model on a fake news dataset and reached an accuracy of 90%. Many other machine learning models perform much better, but they typically require a lot of feature engineering and data wrangling. We used a deep learning model so that the model can work out the relevant features on its own.
Credit: Om Chaithanya V