Rohan Sawant

Posted on May 10, 2020

I made an AI Chrome Extension to fight Fake News! - Bunyip

#showdev #opensource #webdev #javascript

Bunyip is a Chrome Extension that detects AI-generated text. It helps users spot fake news articles that might have been generated automatically and not written by a real human!

You can install the extension from the Chrome Web Store! - Bunyip - Detect all the Glitter in the Wild

As with several of my older projects, the initial aim with Bunyip was to monetize it.

CT83 / Bunyip

View on GitHub

Buut, I could not find a market for it, so for now I decided against making it a paid extension on the Chrome Web Store and open sourced it instead.

Working

The selected text is sent to a Serverless Function for classification
The response contains the words with the likelihood of each word being generated by an AI.
The extension then visualizes these words, using different colors to correspond to different probabilities.
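
The visualization step above can be sketched roughly as follows. This is a minimal illustration and not Bunyip's actual code: the response shape (`word`, `probability`), the thresholds, and the color names are all assumptions made for the example.

```javascript
// Assumed response shape: [{ word: "The", probability: 0.93 }, ...]
// Thresholds and colors are illustrative, not the values Bunyip uses.
const BUCKETS = [
  { min: 0.75, color: "green" },
  { min: 0.5, color: "yellow" },
  { min: 0.25, color: "red" },
  { min: 0, color: "purple" },
];

// Pick the first bucket whose lower bound the probability reaches.
// Anything that matches no bucket (for example NaN) falls back to "purple".
function colorFor(probability) {
  const bucket = BUCKETS.find((b) => probability >= b.min);
  return bucket ? bucket.color : "purple";
}

// Escape the word so page text cannot inject markup into the overlay.
function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Turn the classified words into one highlighted HTML string.
function renderWords(words) {
  return words
    .map(
      ({ word, probability }) =>
        `<span style="background:${colorFor(probability)}">${escapeHtml(word)}</span>`
    )
    .join(" ");
}
```

Keeping the bucket table separate from the rendering makes it easy to tune the color scale without touching the markup.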

This project builds on the Giant Language Model Test Room, which enables a forensic analysis of how likely an automatic system has generated a text. - Live Demo

Components

Three components make up Bunyip.

1) Bunyip - Chrome Extension

This simply sends the selected text to the GCP Cloud Function proxy, which then forwards it to GLTR.
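
As a rough sketch of that first hop, the extension side could look something like this. The proxy URL, the `{ text }` payload shape, and the function names are hypothetical; the post does not show the real ones.

```javascript
// Hypothetical proxy URL; the real endpoint is not given in this post.
const PROXY_URL = "https://example.com/bunyip-proxy";

// Build the POST request for the selected text. The { text } payload
// shape is an assumption made for this sketch.
function buildClassifyRequest(text) {
  const trimmed = text.trim();
  if (!trimmed) {
    throw new Error("No text selected");
  }
  return {
    url: PROXY_URL,
    options: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ text: trimmed }),
    },
  };
}

// Send the selection to the proxy and return the parsed JSON response.
// fetchImpl is injectable so the function can be exercised without a network.
async function classifySelection(text, fetchImpl = fetch) {
  const { url, options } = buildClassifyRequest(text);
  const response = await fetchImpl(url, options);
  if (!response.ok) {
    throw new Error(`Proxy returned HTTP ${response.status}`);
  }
  return response.json();
}
```

In a content script this might be called as `classifySelection(window.getSelection().toString())`. Note that no API key appears anywhere on this side.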

2) Serverless Proxy running on Google Cloud Platform

Calls to the Algorithmia REST API must include an API key, and the only way I could think of to avoid hardcoding it in the Chrome Extension was to put a proxy in between, hence the workaround.

3) Modified Version of GLTR - A tool to detect automatically generated text

This is deployed on Algorithmia's serverless environment and exposed through a REST API. The GCP Function calls it internally and returns the response to the Chrome Extension.

How did I go about making it?

Step 1 - Analyzing the problem statement at hand

The goal: create a Chrome Extension that detects whether the selected text was generated by an AI.

I created a list of all the things I needed to learn: Chrome Extensions, Serverless Deployment, GCP Cloud Functions, and the GLTR integration.

Step 2 - Getting GLTR up and running locally

This was way easier than I thought it would be, and everything worked in a jiffy. I installed the requirements, started the Flask server, and used Postman to test everything locally.

Step 3 - Creating the Chrome Extension

This was the easiest but also the most time-consuming part of the process. The UI took longer to make than I expected, but the results were impressive!

Step 4 - Deploying the Flask App onto a Serverless Cloud Platform

This was super tricky and I touch more on this in the Challenges section.

Step 5 - Publishing the Extension to the Chrome Web Store

Documentation on how to do this was pretty clear, so I was able to power through this.

Challenges

Deployment is always a doozy

Yes, one of the most underestimated parts of building Bunyip was the overwhelming amount of extra work needed to make it run in the wild and not just on my laptop. Deploying the entire setup somewhere cheap and scalable was the major challenge.

1. Models cannot be directly deployed onto Serverless functions

I had assumed that I would just be able to directly deploy my entire app to some Serverless Environment and everything would be a breeze, well...

Turns out the PyTorch package, which is needed to run the model, was over 500 MB, which meant it was too big for AWS Lambda Functions and GCP Cloud Functions to handle.

Then, I thought about deploying the Flask App to AWS EC2 instances instead.

But I realized I would need at least a t2.large instance, which was more than I wanted to spend on a side project.

Then I stumbled across Algorithmia, which allows you to wrap your Python code in a REST API, complete with authentication, hosting, logging, client-side libraries for all major languages, and so much more!

With a little bit of refactoring and after a few tries, I was able to get my app onto it. The next step was simply making POST calls to it from my Chrome Extension.

2. Accessing the Algorithmia API without hardcoding the API Keys in the Chrome Extension

Algorithmia requires you to include an API Key every time you make a request to it. Traditionally, this would mean the Bunyip Chrome Extension would have to do this. But I didn't think it was wise to just expose my credentials to all of the internets!

To get around this, I created a simple proxy function and deployed it as a GCP Cloud Function. The proxy makes authenticated calls on behalf of the browser and returns the appropriate responses, which keeps my API Key out of the extension and private.
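
A minimal sketch of such a proxy, written for a Node.js HTTP Cloud Function, might look like the following. This is not the actual Bunyip proxy. The upstream URL (with its `OWNER/ALGORITHM` placeholders), the `ALGORITHMIA_API_KEY` environment variable name, and the payload shapes are assumptions. The `Authorization: Simple <key>` header reflects Algorithmia's REST API as I understand it, and the global `fetch` requires Node.js 18 or newer.

```javascript
// Hypothetical upstream URL; OWNER and ALGORITHM are placeholders.
const ALGORITHM_URL = "https://api.algorithmia.com/v1/algo/OWNER/ALGORITHM";

// Build the authenticated upstream request. The key is passed in from the
// server's environment, so it never ships inside the extension bundle.
function buildUpstreamRequest(text, apiKey) {
  if (!apiKey) {
    throw new Error("Missing API key");
  }
  return {
    url: ALGORITHM_URL,
    options: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Simple ${apiKey}`,
      },
      // Assumption: the algorithm accepts the text as a JSON string.
      body: JSON.stringify(text),
    },
  };
}

// HTTP entry point in the Express-style (req, res) shape used by
// Node.js Cloud Functions. Expects a JSON body like {"text": "..."}.
async function proxy(req, res) {
  const text = req.body && req.body.text;
  if (req.method !== "POST" || typeof text !== "string" || text.length === 0) {
    res.status(400).json({ error: "Expected a POST with a JSON text field" });
    return;
  }
  try {
    const { url, options } = buildUpstreamRequest(
      text,
      process.env.ALGORITHMIA_API_KEY
    );
    const upstream = await fetch(url, options);
    res.status(upstream.status).json(await upstream.json());
  } catch (err) {
    // Covers a missing key, network errors, and non-JSON upstream replies.
    res.status(502).json({ error: "Upstream request failed" });
  }
}

module.exports = { proxy, buildUpstreamRequest };
```

A real deployment would also need to handle CORS or extension host permissions, and would probably want rate limiting, since anyone who finds the proxy URL can call it.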

Motivation

Andrej Karpathy tweeted this, and I thought, "Yes! That's something I could actually do!".

So I did!

References

This project builds on the strong foundation provided by the Giant Language Model Test Room, built by Hendrik Strobelt, Sebastian Gehrmann, and Alexander M. Rush. GLTR enables a forensic analysis of how likely it is that an automatic system has generated a text.
You can find the GLTR instance deployed as an API on Algorithmia - bunyip-gpt-detector
You can find OpenAI's original GPT Detector, deployed as an API here - gpt-detector

Credits

Bunyip would never have been possible without the work of @hen_str, @S_Gehrmann, and @harvardnlp on the Giant Language Model Test Room. They even went out of their way to help me on Twitter when I hit a few roadblocks!

Go follow them, now!

Top comments (1)

nickveliki • May 10 '20

Noice! Fight AI generated text with AI analysis