Discussion on: Using Machine Learning To Generate Human-Readable News Articles

Okay, you find similar paragraphs from different texts with a reasonable approach. But the title is really misleading: it's not text generation when you just concatenate paragraphs from different texts. The article is well written, but what you're doing is quite questionable, and you also don't really seem to understand what is happening there.

Tariq Ali • Apr 19 '17

I appreciate your concern.

When I say "text generation", I'm referring to all processes by which a computer is able to write out text at least semi-autonomously (so not just typing words out on a keyboard). I admit that this is an exceedingly broad definition (as it includes Markov chains, neural networks, world models, Mad-Lib templates, etc., etc.), but I like this "big tent" definition because trying to get computers to write human-readable text is a very difficult task regardless of the method chosen.

I may want to host a discussion on dev.to on what should be the proper definition of text generation though, and whether the definition should be made more restricted to avoid people getting misled, or if the "big tent" definition should be retained. I actually like your term "concatenate" to describe certain approaches to text generation, since it makes it clear that the computer isn't being "creative" in writing out this text, but simply stitching together other people's texts.

As for "don't really seem to understand what is happening there", it is true that I may have made a bad assumption. My reason for declaring that the program generated news articles is that the final text that has been generated/concatenated appears very similar to a news article. Humans have a tendency to see patterns even when none exist, a phenomenon known as apophenia. So if humans see something that is similar to a news article, they will treat it as if it were a news article. At least, that was my assumption.

This was a bad assumption, as every human is different, and just because I view the text as an article doesn't mean that every human will view the text as an article. Feedback from the project seems to be very positive (suggesting that most people did treat the text as articles), but one person has pointed out that they saw the generated/concatenated text as a collage instead of an article. This suggests that I need to be more cautious in the future, and if I am to make assumptions about humans, I need to study human psychology in depth first.

It is very important to prevent people (including me) from inadvertently hyping up technology, thereby leading to a hype cycle and an inevitable AI Winter. Therefore, I really appreciate the feedback that you have given me and will definitely implement it in my next experiment. Thanks.