My first AI post using Pre-trained GPT-3

Daniel Dominguez
Software Product Manager @SamXLabs | Editorial Team for AI/ML/Data @InfoQ

This is my first post experimenting with the pre-trained GPT-3, an autoregressive language model that uses deep learning to produce human-like text. Below is the resulting post, which was written automatically; it was supposed to be about how GPT-3 can generate text on its own:

The Pre-trained GPT-3
The full GPT-3 train/test data set was sourced from John Baez's blog. All the tweets have been tagged with a normalized date of 23 February 2014. The tags have been extracted using the Python package impacket. The data set contains: 4301 tweets 633,583 IDs 401,829 words (full list here) 2.6 million words (just the full text of tweets). The ID's were extracted using the Python package twitter utils and the tweets were classified by K-Means clustering. The K-Means clustering was again done using impacket. The load time is roughly 500 ms. The network architecture is pretty straight-forward. I only trained the networks for about 1 hour on my desktop. The tweets are now directly in Tensorflow and there's no more manual manipulation needed.

Conclusion
This is just a taste of the results I wanted to share. The current version is nowhere near human quality. I’ll try to write more about deep learning and auto-regressive models for text-related problems.
