The "Commonsense" Problem In Computer-Generated Works
Tariq Ali Aug 26, 2016
"It's ... worth noting, I think, that many successful machine-made works, like the music of EMI or Cope's subsequent machine creativity project, Emily Howell, or the screenplay for Sunspring, rely heavily on interpretation by humans, making the machine prominence of the source a novelty which excuses the search for meaning instead of encouraging it.
By which I mean, it's difficult to say what Benjamin, the AI that wrote Sunspring, could possibly be getting at, because Benjamin is an artificial intelligence. The common sense read is that Benjamin is getting nothing." -- Mike Rugnetta, Can an Artificial Intelligence Create Art?
Machine capabilities are constantly increasing in a variety of different fields. First limited to industrial automation, machines are now being used for 'creative' enterprises as well: music generation, painting generation, text generation, etc. However, while a machine is able to generate creative works, it does not understand what it is generating. From its perspective, it's simply manipulating symbols based on its external input and its internal programming. Machines lack the commonsense knowledge that we take for granted. Researchers are attempting...and failing...to implement commonsense knowledge in AI.
And the lack of commonsense knowledge could serve as a barrier to full acceptance of computer-generated works. In fact, I once wrote a blog post entitled Why Robots Will Not (Fully) Replace Human Writers, arguing that algorithms will "only" write the majority of all literature, instead of displacing human authors entirely:
Humans take for granted their ability to perceive the world. Their five senses gives a continual stream of data that humans are able to quickly process. Bots, on other hand, are only limited to the "raw data" that we give them to process. They will not "see" anything that is not in the dataset. As a result, how the bots understand our world will be very foreign to our own (human) understanding.
... Humans will likely tolerate the rise of automation in literature, and accept it. Bots may even write the majority of all literature by 2100. But there will still be some marginal demand for human writers, simply because humans can relate more to the "worldview" of other humans. These human writers must learn how to coexist with their robotic brethren though.
However, implementing commonsense knowledge is not necessary for successful text generation. Unlike Mike Rugnetta, I believe that there are three approaches that can allow us to successfully search for meaning within computer-generated texts, even when the computers fail to understand or appreciate that meaning. Each approach carries its own drawbacks, however, and the programmer must decide which trade-offs to make.
Computers and humans speak different languages. While we speak in natural languages, computers only understand programming languages. But what if we translate our ideas into a series of hard-coded rules based on real-life? Then the computer can read those rules and use them as a basis by which it can then generate coherent and meaningful text. The meaning, after all, comes from the hard-coded rules that the computer is simply executing.
The computer-science term for "a bunch of hardcoded rules" is the "world model", and when programmers started research into text generation, they immediately started using world models. In the 1960s, they built SAGA II, a computer program that can generate scripts about a gunfight between a cop and a robber, by writing rules on how the cop and the robber will behave when facing each other. In the 1970s and onwards, they built many story generation algorithms using world models. Some algorithms tried to simulate the behavior of different characters like SAGA II, while other algorithms attempted to simulate the behavior of the author's mind in developing the story and deciding what characters do in it. A few algorithms even implemented a form of 'self-evaluation' of the generated content, allowing the machine to 'revise' the story if it doesn't meet certain criteria.
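To make the idea of a character-simulating world model concrete, here is a minimal sketch in Python. It is not SAGA II's actual rule set, which is not reproduced in this post. The character state (bullets, wounds, cover), the hit chances, and every function name below are assumptions invented for illustration.

```python
import random

# Hypothetical world model: each character is a small dictionary of state,
# and hard-coded rules decide what may happen next. These rules are
# illustrative only and are not taken from SAGA II.

def available_actions(actor):
    """Return the actions the rules allow for `actor` right now."""
    if actor["wounded"] >= 2:
        return []  # rule: a badly wounded character cannot act
    actions = ["takes cover"]
    if actor["bullets"] > 0:
        actions.append("fires at")
    else:
        actions.append("reloads")  # rule: an empty gun must be reloaded
    return actions

def apply(action, actor, target, rng):
    """Update the world state and return one sentence of script."""
    if action == "fires at":
        actor["bullets"] -= 1
        # rule: a target behind cover is harder to hit (assumed odds)
        hit_chance = 0.2 if target["in_cover"] else 0.6
        actor["in_cover"] = False  # rule: shooting means leaving cover
        if rng.random() < hit_chance:
            target["wounded"] += 1
            return f"The {actor['name']} fires at the {target['name']} and hits."
        return f"The {actor['name']} fires at the {target['name']} and misses."
    if action == "reloads":
        actor["bullets"] = 6
        return f"The {actor['name']} reloads."
    actor["in_cover"] = True
    return f"The {actor['name']} takes cover."

def generate_script(seed=None, max_turns=20):
    """Alternate turns between the two characters until one cannot act."""
    rng = random.Random(seed)
    cop = {"name": "cop", "bullets": 6, "wounded": 0, "in_cover": False}
    robber = {"name": "robber", "bullets": 6, "wounded": 0, "in_cover": False}
    script = []
    actor, target = robber, cop
    for _ in range(max_turns):
        actions = available_actions(actor)
        if not actions:
            script.append(f"The {actor['name']} collapses.")
            break
        script.append(apply(rng.choice(actions), actor, target, rng))
        actor, target = target, actor  # the other character acts next
    return script
```

Calling `print("\n".join(generate_script(seed=1)))` prints one short gunfight. Every sentence is coherent because it follows from the rules and the current state, which is the whole appeal of the approach; the random choice only picks among moves the rules already permit.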
World models have also been used outside of academia. Liza Daly wrote about their current use in both video games and text generation, since they are good at providing coherence. Liza mentioned NaNoGenMo algorithms such as Teens Wander Around The House and The Seeker as examples of world models representing characters, and I would add A Time For Destiny as an example of a world model representing the "author's mind".
According to Liza Daly, world models temporarily fell out of favor during the 1980s because of scaling issues. It simply took too much time for a human to write out these rules, and too long for a computer to process them using 1980s technology. Our computers are faster now, and we may be better at abstracting away the rule-making details, so maybe this time is different.
However, the main drawback to using world models is that they limit the creative potential of the machinery. While hardcoded rules ensure that the generated works have some logical sense behind them (provided that the rules aren't buggy), they also exclude any possibility of interesting creativity. The generated works are sensible, but dull to read. Even the output of a Markov chain can shock you...sometimes. But the output of a world model may be too conventional and predictable. You reduce the risk of generating utter nonsense, but you also reduce the odds of generating something interesting.
It would be crazy to say that romance novels compete against historical fiction novels, or that people will give up reading science fiction once we learn how to mass-produce murder mysteries. Genres exist within literature, and they necessarily exist because human beings have different tastes and desires (though works can easily belong to multiple genres...a historical fiction romance novel, for instance). It makes sense therefore that computer-generated literature could exist as its own separate genre, adhering to its own unique conventions and appealing to a certain, niche audience.
The audience of computer-generated works may come from the fanbase of procedural generation, a computational approach used in video games to produce content. One video game developer, Bruno Dias, talked about procedural generation in an interview for his own game-in-development, Voyageur:
In games, procgen originated as a workaround for technical limitations, allowing games like Elite to have huge galaxies that never actually had to exist in the limited memory of a 90’s computer. But it quickly became an engine of surprise and replayability; roguelikes wouldn’t be what they are if the dungeon wasn’t different each time, full of uncertainty. Voyageur represents an entry into what we could call the “third generation” of procgen in games: procedural generation as an aesthetic.
Voyageur is a "space exploration" game, which uses procedural generation to generate the various textual descriptions of the planets that a player can travel to. Bruno Dias stated that the goal of the procedural generation in his game is to "explor[e] the tenor and meaning of procedural prose". If people like Bruno's procedurally-generated descriptions, they may be receptive to future works that embrace this 'aesthetic'.
NaNoGenMo also seems to represent the ethos of "procedural generation as an aesthetic", with the various programmers interested in using their algorithms to express ideas in new and interesting ways. One of the many news articles about NaNoGenMo compared the yearly competitions to Dadaism, and wrote how one competitor also saw influences of "Burrough’s cut-up techniques" and "constraint-oriented works of Oulipo".
The author (Kathryn Hume) went even further in her defense of "procedural generation as an aesthetic" by pointing out that most humans believe that the purpose of text generation is to "[write] prose that we would have written ourselves". Kathryn Hume believes that text generation would be better off focusing on other goals instead:
[W]hat if machines generated text with different stylistic goals? Or rather, what if we evaluated machine intelligence not by its humanness but by its alienness, by its ability to generate something beyond what we could have created—or would have thought to create—without the assistance of an algorithm? What if automated prose could rupture our automatized perceptions, as Shklovsky described poetry in Art as Device, and offer a new vehicle for our own creativity?
Now, the main drawback of this approach is that it is an admission of defeat...or at least, an admission that your work is only intended for consumption by a niche. A few people might love to "rupture ... automatized perceptions" and would embrace machine-generated texts, warts and all. However, I doubt that the vast majority of humans would embrace generated texts so readily. After all, Dadaism, cut-ups, and Oulipo did not take over the literary world. Even Kathryn Hume agrees that big businesses prefer to invest in "human-like" text generation: "[I]nvestment banks and news agencies like Forbes won’t pay top dollar for software that generates strange prose".
Machines are not perfect. Code can be fallible. Programs can generate output that is dull, boring, and uninteresting.
However, humans are also not perfect. They can be fallible. They generate output that is dull, boring, and uninteresting. The difference is that humans are (usually) able to judge the quality of their work, and determine whether their output is good or bad. The 'good' output of humans is what gets published. The 'bad' output is ignored and forgotten. Sometimes, humans can choose to 'edit' the work of other humans, turning 'bad' output into 'good' output.
In the same way, human curators can also review the output of the machine. They can generate hundreds of stories, and review them to find the most promising ones. The curators could then select "good" stories and show them to the general public, while editing or throwing away the "bad" stories.
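The generate-then-curate workflow described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a description of any real curation tool: the word lists, the toy sentence generator, and the `is_good` callback standing in for the human curator are all hypothetical.

```python
import random

# Hypothetical word lists; any text generator could stand in for this one.
SUBJECTS = ["The robot", "A poet", "The city", "My shadow"]
VERBS = ["dreams of", "forgets", "rewrites", "counts"]
OBJECTS = ["the sea", "its own name", "every door", "the last word"]

def generate_story(rng):
    """Produce one candidate sentence with no notion of quality."""
    return f"{rng.choice(SUBJECTS)} {rng.choice(VERBS)} {rng.choice(OBJECTS)}."

def generate_candidates(n, seed=None):
    """Generate n candidates and drop exact duplicates, keeping order."""
    rng = random.Random(seed)
    return list(dict.fromkeys(generate_story(rng) for _ in range(n)))

def curate(candidates, is_good):
    """Split candidates into kept and rejected piles.

    `is_good` stands in for the curator: in practice a person reads each
    candidate and answers yes or no.
    """
    kept, rejected = [], []
    for story in candidates:
        (kept if is_good(story) else rejected).append(story)
    return kept, rejected
```

In an interactive session, the curator's judgment could be supplied as `lambda s: input(f"{s} Keep? [y/n] ") == "y"`. The machine does the bulk generation, and all of the taste lives in the `is_good` decision, so the published pile reflects the curator as much as the generator.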
The very act of curation also adds meaning to the computer-generated work, as the human curators use their own ideas and beliefs to decide what stories to select and what to reject. When you read the computer-generated work based on human curation, you are reading a work with both machine and human influences.
This approach is far more common than is generally acknowledged. Every time someone gleefully posts the evocative output of a Markov chain, there are other outputs that are filtered away, never to be seen by another human being. Even human journalists, whenever they cover the NaNoGenMo competitions, do not feel the need to copy and paste whole computer-generated novels into their news articles. Instead, they only choose key quotes from the computer-generated novels...quotes that they feel are interesting enough for their audience to read.
You can see this type of approach on full display at CuratedAI, a website that describes itself as "[a] literary magazine written by machines, for people". However, the human programmers can choose what works to send to this literary magazine, and it is perfectly fine for the human programmers to 'lightly edit' the generated output. The machine-generated works on CuratedAI can be interesting to read, but that's because the human editors are there to ensure it stays interesting.
The main drawback with the Human Curation approach is the manual labor involved in the process. While it is easier to "edit" a computer-generated work than it is to create the work in the first place, the human must still play a rather overt role in this "creative" process. There's also a philosophical question: if machines generate literature, and then humans heavily edit the literature before publishing it, then was the final output 'really' computer-generated?
We are currently unable to program commonsense knowledge into our algorithms...however, this does not serve as a complete roadblock to text generation. Programmers are free to use the approaches I outlined above to ensure that the machines are able to generate text filled with meaning and creativity.
Each approach has its drawbacks and flaws. There is nothing scary about drawbacks and flaws, so long as you are aware of them. Trade-offs must be made all the time in software development. Text generation is no different.
Just like every other blog post on text generation, this blog post is generated by a computer. Fairly simple text generation...the introductory and conclusion sections are fixed, but the three "approaches" are randomly shuffled. Here's the source code. The algorithm is very lazy but I've been spending a lot of time writing the content for this blog post, and I'd rather push something out of the door. Sorry. I'll see if I can try to think of some more creative text generation algorithms in the future.
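For readers who want the gist without opening the source, the shuffle described above might look something like the following Python sketch. It is a reconstruction from the description only and not the post's actual source code; the function name `assemble_post` and its parameters are assumptions.

```python
import random

def assemble_post(intro, approaches, conclusion, seed=None):
    """Keep the intro and conclusion fixed; shuffle the middle sections."""
    rng = random.Random(seed)
    middle = list(approaches)  # copy so the caller's list is left untouched
    rng.shuffle(middle)
    # Sections are joined with a blank line between them.
    return "\n\n".join([intro, *middle, conclusion])
```

Calling `assemble_post(intro_text, [world_models, aesthetic, curation], conclusion_text)` would return the full post with the three approaches in a random order, which works only because each approach section is written to stand on its own.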
This blog post was inspired by a comment discussion thread on the blog post NaNoGenMo: Dada 2.0. James Kennedy and I (as "Realist Writer") discussed whether text generation could reach human standards and how to resolve the "friability of semantic trust" that could exist whenever robots produce terrible stories.