Before you go any further, please note that this blog post contains absolutely nothing of value. This was a stupid idea I had last night that I decided to quickly build this morning. It worked. It made me laugh. But there is nothing of value here. If your boss catches you reading this you'll probably be fired. You've been warned.
So - a Markov chain is - in my understanding - a way of determining which value is likely to come after another, based on a set of input data. So given a set of data, let's say words, you can determine which word is most likely to come after another. You can find a great example of this, used to generate realistic Lifetime movie titles, here: "Using Javascript and Markov Chains to Generate Text". Unfortunately the code samples in the blog are broken, but the examples are funny as hell.
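To make the "which word is most likely to come after another" idea a bit more concrete, here is a minimal sketch. It assumes a word-level model that only looks at the previous word, and the function names (`buildFollowerCounts`, `mostLikelyNext`) and the three sample titles are made up purely for illustration. It is not taken from any library.

```javascript
// Count, for every word, how often each other word follows it.
// Hypothetical helper: the name and the sample data are illustrative only.
function buildFollowerCounts(titles) {
  const counts = new Map();
  for (const title of titles) {
    const words = title.split(/\s+/).filter(Boolean);
    for (let i = 0; i < words.length - 1; i++) {
      const followers = counts.get(words[i]) ?? new Map();
      followers.set(words[i + 1], (followers.get(words[i + 1]) ?? 0) + 1);
      counts.set(words[i], followers);
    }
  }
  return counts;
}

// Return the follower seen most often after `word`, or null if none was seen.
function mostLikelyNext(counts, word) {
  const followers = counts.get(word);
  if (!followers) return null;
  let best = null;
  let bestCount = 0;
  for (const [next, count] of followers) {
    if (count > bestCount) {
      best = next;
      bestCount = count;
    }
  }
  return best;
}

// Made-up sample input: "Live" is followed by "and" twice and "Twice" once.
const counts = buildFollowerCounts([
  'Live and Let Die',
  'Live and Learn',
  'Live Twice',
]);
console.log(mostLikelyNext(counts, 'Live')); // -> "and"
```

Always picking the single most likely word would produce the same output every time, which is why generators typically pick randomly, weighted by these counts.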
I did a quick search and found a great npm library that simplifies creating demos like this: titlegen. From the docs, here is a sample of how easy it is to use:
var generator = titlegen.create();

generator.feed([
  'You Only Live Twice',
  'From Russia with Love',
  'The Man with the Golden Gun',
  'Live and Let Die',
  'Die Another Day'
]);

console.log(generator.next()); // -> "From Russia with the Golden Gun"
console.log(generator.next()); // -> "You Only Live and Let Die Another Day"
console.log(generator.next()); // -> "The Man with Love"
Pretty cool, right? So I thought - what if I tried this with Cure songs? I scraped the content from Wikipedia, did a bit of cleanup, and created this demo:
https://cfjedimaster.github.io/webdemos/generateCure/titlegen.html
If you don't want to click, here are some examples:
The demo is a stupid simple Vue app. The layout is just a few tags so I'll skip it, but here is the JavaScript. Note I've removed most of the Cure titles to keep it shorter:
// source: https://en.wikipedia.org/wiki/Category:The_Cure_songs
let input = `10:15 Saturday Night
The 13th
Accuracy
LOTS OF STUFF REMOVED
The Walk
Why Can't I Be You?
Wrong Number`;
input = input.split('\n');
var generator = titlegen.create();
generator.feed(input);
const app = new Vue({
  el: '#app',
  data() {
    return {
      title: ""
    }
  },
  created() {
    this.newTitle();
  },
  methods: {
    newTitle() {
      console.log('generating cureness');
      this.title = generator.next();
    }
  }
});
I don't think I understand even 1% of the math behind this and I don't know how realistic this is, but my God did it bring a smile to my face. If you want to look at all the code, you can find it here: https://github.com/cfjedimaster/webdemos/tree/master/generateCure
Oh, and finally, you can test a Depeche Mode version here: https://cfjedimaster.github.io/webdemos/generateDepecheMode/titlegen.html
Top comments (7)
I was looking for Markov chains yet you just use some library. Your post title was rather misleading.
Sorry you didn't like it - I had thought from the title that it was obvious that I was just having a bit of fun. Hopefully the link PNS11 shared will be helpful.
After reading the title, I really thought that I would learn how you used the concept of Markov chains to generate titles.
Anyway, I just wanted to give you feedback. You see title marketing/link baiting on so many websites nowadays that people got used to it. But it annoys me EVERY time because it feels like being cheated.
While it is just another library I found crap.l to be readable and easy to learn about basic Markov chaining from, picolisp.com/wiki/?ticker .
Implementation usually goes something like this. The math behind it is basically a frequency calculation of how common it is for each word to be followed by another. Then, when you build your strings, you use this probability estimate to choose which words to chain onto one you picked (pseudo-)randomly as a kind of seed.
(Googlebot seems to have kept indexing the ticker since that article was written; today it has crawled and cached thousands of pages.)
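The frequency-based approach described above can be sketched in a few lines of JavaScript. This is only an illustration of the general technique, not how titlegen or the PicoLisp ticker is actually implemented. The names `buildChain`, `generateTitle`, `START`, and `END` are hypothetical, and the `random` and `maxWords` parameters are assumptions added so the output can be controlled. The input is the Bond title list from the post.

```javascript
// Markers for "before the first word" and "after the last word" of a title.
const START = Symbol('start');
const END = Symbol('end');

// Map each word to the list of words seen right after it. Duplicates are kept
// on purpose: a uniform pick from the list is then a frequency-weighted pick.
function buildChain(titles) {
  const chain = new Map();
  for (const title of titles) {
    const words = [START, ...title.split(/\s+/).filter(Boolean), END];
    for (let i = 0; i < words.length - 1; i++) {
      if (!chain.has(words[i])) chain.set(words[i], []);
      chain.get(words[i]).push(words[i + 1]);
    }
  }
  return chain;
}

// Walk the chain from START, picking a random follower at each step,
// until END is picked or maxWords is reached (a guard against long loops).
function generateTitle(chain, random = Math.random, maxWords = 12) {
  const words = [];
  let current = START;
  while (words.length < maxWords) {
    const followers = chain.get(current);
    if (!followers) break; // empty chain, nothing to pick from
    const next = followers[Math.floor(random() * followers.length)];
    if (next === END) break;
    words.push(next);
    current = next;
  }
  return words.join(' ');
}

const chain = buildChain([
  'You Only Live Twice',
  'From Russia with Love',
  'The Man with the Golden Gun',
  'Live and Let Die',
  'Die Another Day',
]);

console.log(generateTitle(chain)); // random, so the output varies per run
```

Because "with" appears in two titles, a walk that starts in "From Russia with Love" can hop over to "the Golden Gun" at that word, which is the kind of mashup the titlegen sample output shows. Picking the first word from actual title openings (the `START` followers) is one possible way to do the seeding step.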
Very nice, I think I prefer Depeche Mode...
"I Sometimes Wish I Sometimes Wish I Sometimes Wish I Sometimes Wish I Was Dead" :)
It would be pretty neat to bring this a step further and somehow let it generate songs given any band name. There must be some API that lets you get a list of songs given some band name...
EDIT: Yep! It's possible! developer.musicgraph.com/api-docs/...
I wrote a program a few years ago that used classification and clustering algorithms to generate blog posts (mainly for use in dark SEO arts that were common at the time). My goal was to make it more powerful than the garbage text that was put together by Markov chain generators. The results were mixed depending on the blog topic. The results would read normal enough at first but would sometimes get really weird. What did surprise me was that in a number of cases blog comments from real people were posted suggesting that I learn how to speak English.
What I found is that short blurbs and titles were easy, but building larger articles (500+ words) was hard because it needed a large dictionary to draw from, and building the dictionary was very tedious and time consuming. The final dictionary was around 10GB. Maybe one day I'll revisit it and see if I can improve upon the backend dictionary building part.