DEV Community

Discussion on: Generating Random Cure Song Titles with Markov Chain

Frank Carr

I wrote a program a few years ago that used classification and clustering algorithms to generate blog posts (mainly for use in the dark SEO arts that were common at the time). My goal was to make it more powerful than the garbage text put together by Markov chain generators. The results were mixed, depending on the blog topic: the text would read normally enough at first but would sometimes get really weird. What did surprise me was that, in a number of cases, real people posted blog comments suggesting that I learn how to speak English.

What I found is that short blurbs and titles were easy, but building larger articles (500+ words) was hard because it needed a large dictionary to draw from, and building the dictionary was very tedious and time-consuming. The final dictionary was around 10GB. Maybe one day I'll revisit it and see if I can improve the backend dictionary-building part.