<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Ratnesh Kumar</title>
    <description>The latest articles on DEV Community by Ratnesh Kumar (@soniratnesh).</description>
    <link>https://dev.to/soniratnesh</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F391382%2Ff038485b-30f1-439d-90dc-eea96414b22e.jpeg</url>
      <title>DEV Community: Ratnesh Kumar</title>
      <link>https://dev.to/soniratnesh</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/soniratnesh"/>
    <language>en</language>
    <item>
      <title>Compendium: Abstractive Text Summarization Using an Attention Mechanism</title>
      <dc:creator>Ratnesh Kumar</dc:creator>
      <pubDate>Mon, 25 May 2020 00:52:59 +0000</pubDate>
      <link>https://dev.to/soniratnesh/compendium-a-abstractive-text-summarization-using-attention-mechanism-4n0</link>
      <guid>https://dev.to/soniratnesh/compendium-a-abstractive-text-summarization-using-attention-mechanism-4n0</guid>
      <description>&lt;h2&gt;
  
  
  My Final Project
&lt;/h2&gt;

&lt;p&gt;“Don’t need the full article, just the summary,” I often thought while reading an article or a newspaper during my undergraduate years. It's such a drag to read the full report when you could just read the compendium. Right?&lt;/p&gt;

&lt;p&gt;So, during my undergraduate studies, I decided to automate it, since doing it manually takes a lot of time. Using the magnificent powers of NLP and deep learning, I came up with Compendium: an abstractive text summarization model that takes a text and produces its summary.&lt;/p&gt;

&lt;h2&gt;
  
  
  Link to Code
&lt;/h2&gt;

&lt;p&gt;Have an idea, want to contribute, or just want to see the code behind the magic? The code can be found in the following GitHub repository.&lt;br&gt;
&lt;/p&gt;
&lt;div class="ltag-github-readme-tag"&gt;
  &lt;div class="readme-overview"&gt;
    &lt;h2&gt;
      &lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--vWogaON8--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://practicaldev-herokuapp-com.freetls.fastly.net/assets/github-logo-28d89282e0daa1e2496205e2f218a44c755b0dd6536bbadf5ed5a44a7ca54716.svg" alt="GitHub logo"&gt;
      &lt;a href="https://github.com/soni-ratnesh"&gt;
        soni-ratnesh
      &lt;/a&gt; / &lt;a href="https://github.com/soni-ratnesh/compendium"&gt;
        compendium
      &lt;/a&gt;
    &lt;/h2&gt;
    &lt;h3&gt;
      Generates summary of a given news article. Used attention seq2seq encoder decoder model.
    &lt;/h3&gt;
  &lt;/div&gt;
  &lt;div class="ltag-github-body"&gt;
    
&lt;div id="readme" class="md"&gt;
&lt;h1&gt;
Compendium&lt;/h1&gt;
&lt;h2&gt;
Introduction&lt;/h2&gt;
&lt;p&gt;Compendium is a seq2seq abstractive text summarization model based on a GRU encoder-decoder with an attention mechanism.&lt;/p&gt;
&lt;h2&gt;
Base Info&lt;/h2&gt;
&lt;p&gt;Files and their uses are listed below:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;Data -&amp;gt; Folder used for train and test data storage
helper -&amp;gt; Contains helper functions used in model.ipynb
brain -&amp;gt; Contains the trained RNN model
Data Clean.ipnb -&amp;gt; IPython notebook for cleaning and splitting data into train, val and test sets
model.ipynb -&amp;gt; IPython notebook for training and testing the model
requirement -&amp;gt; txt file containing the required libraries
&lt;/code&gt;&lt;/pre&gt;
&lt;h2&gt;
Requirements&lt;/h2&gt;
&lt;p&gt;&lt;code&gt;Python : 3.X&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;&lt;code&gt;Pip&lt;/code&gt;&lt;/p&gt;
&lt;h2&gt;
Installation&lt;/h2&gt;
&lt;p&gt;Steps for installation:&lt;br&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Install virtualenv
&lt;code&gt;pip install virtualenv&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Clone
&lt;code&gt;git clone https://github.com/soni-ratnesh/compendium.git&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Change directory
&lt;code&gt;cd compendium&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Create virtual environment
&lt;code&gt;virtualenv env&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Activate the virtual environment
&lt;code&gt;. env/bin/activate&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Install required library
&lt;code&gt;pip install -r requirements.txt&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Run Jupyter notebook
&lt;code&gt;jupyter notebook&lt;/code&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h2&gt;
Result&lt;/h2&gt;
&lt;p&gt;The test metrics are:&lt;br&gt;&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;Test Loss     :  2.23
Test PPE      :  10.87
&lt;/code&gt;&lt;/pre&gt;
&lt;h2&gt;
Need trained model?&lt;/h2&gt;
&lt;p&gt;We do provide trained model…&lt;/p&gt;
&lt;/div&gt;
  &lt;/div&gt;
  &lt;div class="gh-btn-container"&gt;&lt;a class="gh-btn" href="https://github.com/soni-ratnesh/compendium"&gt;View on GitHub&lt;/a&gt;&lt;/div&gt;
&lt;/div&gt;
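A note on the numbers in the README above: assuming PPE denotes perplexity, a common convention derives it from the average cross-entropy loss as exp(loss). The exact figure depends on how the loss is averaged (per token vs. per batch), so the result of this formula need not match the reported value exactly. A quick sketch of the relationship:

```python
import math

# Reported test loss from the README, assumed to be an average
# cross-entropy in nats.
test_loss = 2.23

# Perplexity is conventionally the exponential of the average
# per-token cross-entropy loss.
perplexity = math.exp(test_loss)
print(round(perplexity, 2))
```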


&lt;h2&gt;
  
  
  How I built it
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;The project was built using PyTorch, spaCy, and torchtext. PyTorch was used to build the summarization model, while spaCy and torchtext were used to pre-process the data.&lt;/li&gt;
&lt;li&gt;The summarization model is an encoder-decoder model with GRU cells. Why GRU and not LSTM? The choice was made after experimentation: I tried both, and GRU performed slightly better, so GRU it is.&lt;/li&gt;
&lt;li&gt;Choosing the data to train on was one of the most important decisions, since data is the key factor here: it's always garbage in, garbage out. For better generalization, I trained on a dataset of 15,000+ news articles from &lt;a href="https://www.kaggle.com/sunnysai12345/news-summary"&gt;Kaggle&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;After trying a few different approaches, I finally arrived at the attention mechanism. Challenging myself to learn and implement it was one of the best parts of the project.&lt;/li&gt;
&lt;li&gt;The GitHub Education Pack helped me get a domain for publishing my work.&lt;/li&gt;
&lt;/ul&gt;
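For readers curious about what "attention" computes at each decoding step: the decoder state is scored against every encoder state, the scores are softmax-normalized into weights, and the weighted sum of encoder states becomes the context vector fed to the decoder. A minimal dot-product sketch in plain Python (the actual model in the repo uses PyTorch with learned parameters; the function names here are illustrative):

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of scores.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(decoder_state, encoder_states):
    # Dot-product score between the decoder state and each encoder state.
    scores = [sum(d * h for d, h in zip(decoder_state, state))
              for state in encoder_states]
    weights = softmax(scores)
    # Context vector: attention-weighted sum of the encoder states.
    dim = len(encoder_states[0])
    context = [sum(w * state[i] for w, state in zip(weights, encoder_states))
               for i in range(dim)]
    return weights, context
```

The weights always sum to 1 and concentrate on the encoder states most similar to the current decoder state, which is what lets the decoder "look back" at the relevant part of the source article.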

&lt;h2&gt;
  
  
  Additional Thoughts / Feelings / Stories
&lt;/h2&gt;

&lt;p&gt;I am really proud of the model, and I want to keep improving it and making it more general.&lt;br&gt;
Lastly, I just want to say: every locked door has a key; you just have to know where to find it.&lt;/p&gt;

&lt;p&gt;Thank you for reading, and happy coding!&lt;/p&gt;

</description>
      <category>octograd2020</category>
    </item>
  </channel>
</rss>
