Meeting Minutes

#octograd2020 #githubsdp

Abstractive Summarization and Entity Extraction from Minutes of a Meeting

We developed an end-to-end solution that generates the minutes of a meeting from audio or scanned text. We fine-tuned the BART transformer model on the AMI Meeting Corpus for abstractive summarization and achieved strong ROUGE scores. We used a Bi-LSTM CRF model for Named Entity Recognition to identify names, places, dates, times, and similar entities. We implemented a speaker diarization system using a VAD (voice activity detection) system and a CNN. Finally, we built an intuitive Flutter application to serve these modules, all hosted on Google Cloud Platform and Digital Ocean.
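For readers unfamiliar with ROUGE, here is a minimal sketch of ROUGE-1, the unigram-overlap variant. The post does not say which ROUGE variants or which toolkit we used for evaluation, so treat this as an illustration of the idea only: the function name `rouge_1`, the whitespace tokenization, and the example sentences are all assumptions made for this sketch.

```python
from collections import Counter


def rouge_1(reference: str, candidate: str) -> dict:
    """Unigram-overlap precision, recall and F1 between two texts.

    Illustrative only: tokenization is a plain lowercase whitespace split,
    which is simpler than what standard ROUGE toolkits do.
    """
    ref_tokens = reference.lower().split()
    cand_tokens = candidate.lower().split()

    # Clipped overlap: a word counts at most as often as it appears in both texts.
    overlap = sum((Counter(ref_tokens) & Counter(cand_tokens)).values())

    precision = overlap / len(cand_tokens) if cand_tokens else 0.0
    recall = overlap / len(ref_tokens) if ref_tokens else 0.0
    f1 = 2 * precision * recall / (precision + recall) if overlap else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical reference summary and model output.
scores = rouge_1(
    "the team agreed to ship the demo on friday",
    "the team will ship the demo friday",
)
print(scores)
```

Recall rewards a summary for covering the reference, precision penalizes padding, and F1 balances the two, which is why ROUGE is usually reported as all three.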

Demo Link

Video Demo

Link to Code

The code repositories have to be kept private until the end of the semester in May, but this is a test implementation repository that we used for our initial research.
GitHub Repo

How We Built It

These are the tools and technologies we used to develop this project:

Summarization
- PyTorch Transformers
- PyTorch Lightning
- Google Cloud VM (Model Training)
Named Entity Recognition
- PyTorch Transformers
Speech Recognition and Speaker Diarization
- Google Speech API
- PyTorch
Backend
- Node.js
- MongoDB (MongoDB Cloud Atlas - GitHub Education Pack)
- JSON Web Tokens (JWT)
- Digital Ocean (Deployment) (GitHub Education Pack)
Mobile Application
- Flutter
- Firebase ML Vision

We also pointed a custom domain, obtained from Namecheap (GitHub Education Pack), at the Digital Ocean server.
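To give a feel for the voice activity detection step that speaker diarization starts from, here is a minimal energy-threshold sketch in plain Python. It is not the VAD system or the CNN used in this project, whose details are not described in this post. The function names, the frame length of 160 samples, and the threshold of 0.1 are hypothetical values chosen for illustration.

```python
import math


def frame_energies(samples, frame_len):
    """Split samples into non-overlapping frames and return each frame's RMS energy."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energies.append(math.sqrt(sum(s * s for s in frame) / frame_len))
    return energies


def speech_segments(samples, frame_len=160, threshold=0.1):
    """Return (start_sample, end_sample) spans whose RMS energy exceeds the threshold.

    frame_len and threshold are illustrative assumptions, not tuned values.
    """
    energies = frame_energies(samples, frame_len)
    segments = []
    seg_start = None
    for i, energy in enumerate(energies):
        if energy > threshold and seg_start is None:
            seg_start = i * frame_len                    # speech begins
        elif energy <= threshold and seg_start is not None:
            segments.append((seg_start, i * frame_len))  # speech ends
            seg_start = None
    if seg_start is not None:                            # audio ended mid-speech
        segments.append((seg_start, len(energies) * frame_len))
    return segments


# Synthetic signal: silence, a burst of constant amplitude 0.5, then silence.
signal = [0.0] * 320 + [0.5] * 480 + [0.0] * 160
print(speech_segments(signal))
```

On this synthetic signal the sketch returns `[(320, 800)]`, the span of the loud burst. In a diarization pipeline, segments like these would then be passed to a model that decides which speaker each one belongs to.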

Additional Thoughts / Feelings / Stories

This application was conceived during a Students' Council meeting. One of us always had to write everything down and then dictate it before the next meeting. In a college environment, where we already have limited time and are juggling studies and council activities, that feels extremely time-consuming.
