I wrote a program a few years ago that used classification and clustering algorithms to generate blog posts (mainly for the dark SEO arts that were common at the time). My goal was to make it more powerful than the garbage text put together by Markov chain generators. The results were mixed depending on the blog topic: they would read normal enough at first but would sometimes get really weird. What did surprise me was that, in a number of cases, real people posted blog comments suggesting that I learn how to speak English.
What I found is that short blurbs and titles were easy, but building larger articles (500+ words) was hard because it needed a large dictionary to draw from, and building the dictionary was very tedious and time-consuming. The final dictionary was around 10GB. Maybe one day I'll revisit it and see if I can improve the backend dictionary-building part.