Mechanizing Surrealism: A look at absurd_gloss

#nlp #python #linguistics #todayilearned

Lately, I've been spending some time exploring the stranger corners of programming. I've been playing with natural language processing tools in Python, and last week I wrote about finding unique words in Alice in Wonderland.

While I was working through that question, I found an enigmatic method in the documentation for the Nodebox English Linguistics Library. It was called absurd_gloss and purported to return "an absurd description for the word."

Not entirely sure what that meant, I tried it out a little for myself:

>>> en.noun.absurd_gloss('castle')
'a place where you are just as comfortable and content as if you were home'
>>> en.noun.absurd_gloss('aardvark')
'primitive reptile having no opening in the temporal region of the skull; all extinct except turtles'
>>> en.noun.absurd_gloss('truck')
'handcart for moving a load of laundry'
>>> en.noun.absurd_gloss('computer')
'a subsidiary organ of government created for a special purpose; "are the judicial instrumentalities of local governments adequate?"; "he studied the French instrumentalities for law enforcement"'
>>> en.noun.absurd_gloss('cow')
'sure-footed mammal of mountainous northwestern North America'
>>> en.noun.absurd_gloss('pelican')
'chiefly web-footed swimming birds'
>>> en.noun.absurd_gloss('sleep')
'the London residence of the British sovereign'

I found the results a little surreal: sometimes related to the original input in an uncanny way, and sometimes way off base. So what was happening here?

I dug into the source code and found the method definition:

def absurd_gloss(q, sense=0, pos=NOUNS, up=3, down=2):

    """

    Attempts to simulate humor:
    takes an abstract interpretation of the word,
    and takes random examples of that abstract;
    one of these is to be the description of the word.

    The returned gloss is thus not purely random,
    it is still faintly related to the given word.

    """

    from random import random, choice

    def _up(path):
        p = hypernym(path, sense, pos)
        if p: return p[0][0]
        return path

    def _down(path):
        p = hyponym(path, sense, pos)
        if p: return choice(p)[0]
        return path

    for i in range(up): q = _up(q)
    for i in range(down): q = _down(q)
    return gloss(q)

Okay, let's unpack this. First, the docstring: "Attempts to simulate humor." 😂 Teaching computers to be funny probably ranks just behind caching and naming things on the list of hard problems.

q is the input word. sense specifies which meaning of the word to use, ordered from most to least common (it defaults to the most common). pos is the part of speech to use (it defaults to nouns). And then there are the step counts up and down, which default to 3 and 2.

The way absurd_gloss works is that it takes the first hypernym of the input word, then the first hypernym of that hypernym, and so on.

What's a hypernym, though?

Great question. The documentation describes it as an "abstraction," but it's probably most helpful to give examples:

>>> en.noun.hypernym('cloud')
[['physical phenomenon']]
>>> en.noun.hypernym('animal')
[['organism', 'being']]
>>> en.noun.hypernym('person')
[['organism', 'being'], ['causal agent', 'cause', 'causal agency']]
>>> en.noun.hypernym('castle')
[['mansion', 'mansion house', 'manse', 'hall', 'residence']]

So absurd_gloss climbs to the first hypernym as many times as the up argument specifies (3 by default), and then steps down to a randomly chosen hyponym as many times as the down argument specifies (2 by default).

What's a hyponym?

A hyponym is the opposite of a hypernym: a more specific example of a given word. Let's look at more examples (return values truncated for brevity):

>>> en.noun.hyponym('cloud')
[['coma'], ['nebula'], ['aerosol'], ['cosmic dust'], ['dust cloud'], ['mushroom', 'mushroom cloud', 'mushroom-shaped cloud']]
>>> en.noun.hyponym('animal')
[['pest'], ['critter'], ['creepy-crawly'], ['darter']]
>>> en.noun.hyponym('person')
[['self'], ['adult', 'grownup'], ['adventurer', 'venturer']]
>>> en.noun.hyponym('castle')
[['Buckingham Palace']]
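To make the relationship concrete, here is a minimal sketch that treats hypernyms and hyponyms as two directions of the same link. It does not use the Nodebox library: PARENT, CHILDREN, and build_hyponym_index are hypothetical names, and the data is a tiny hand-made table based on the examples above. It also assumes one parent per word, which the real data does not guarantee ('person' has two hypernyms).

```python
from collections import defaultdict

# Hypothetical toy data (child -> parent), loosely copied from the examples
# in this post. This is NOT the library's data, and each word gets exactly
# one parent here, which is a simplification.
PARENT = {
    "Buckingham Palace": "castle",
    "castle": "mansion",
    "mushroom cloud": "cloud",
    "nebula": "cloud",
    "cloud": "physical phenomenon",
}


def build_hyponym_index(parent_of):
    """Invert a child -> parent map into a parent -> [children] map."""
    index = defaultdict(list)
    for child, parent in parent_of.items():
        index[parent].append(child)
    return dict(index)


CHILDREN = build_hyponym_index(PARENT)

print(PARENT["castle"])    # hypernym of 'castle' -> mansion
print(CHILDREN["castle"])  # hyponyms of 'castle' -> ['Buckingham Palace']
print(CHILDREN["cloud"])   # hyponyms of 'cloud'  -> ['mushroom cloud', 'nebula']
```

Looking a word up in PARENT moves toward the abstract, and looking it up in CHILDREN moves toward the specific.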

absurd_gloss takes successive hypernyms and then successive hyponyms and then returns the gloss (or definition) of the last hyponym. Because the hyponyms are chosen at random, you can get different results for the same input. And the more general a word you start with (and thus the more hyponyms there are) the bigger the potential for the result going way off the rails.
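The same up-then-down walk can be sketched without the library at all. The version below is only an illustration under stated assumptions: HYPERNYM is a hand-made toy table built from words that show up in my traces, toy_absurd_walk is a hypothetical name, and it returns the whole path of words instead of a gloss.

```python
import random

# Hypothetical toy data (child -> parent). It stands in for the real
# word database and only covers a handful of words.
HYPERNYM = {
    "clown": "fool",
    "fool": "simpleton",
    "simpleton": "person",
    "perspirer": "person",
    "watcher": "person",
    "browser": "watcher",
}

# Invert the table so we can also look up children (hyponyms) of a word.
HYPONYMS = {}
for child, parent in HYPERNYM.items():
    HYPONYMS.setdefault(parent, []).append(child)


def toy_absurd_walk(word, up=3, down=2, rng=random):
    """Climb `up` hypernyms, then descend `down` randomly chosen hyponyms.

    Returns the list of words visited, starting with the input word.
    """
    path = [word]
    for _ in range(up):
        # Like _up in the original: stay put if there is no hypernym.
        word = HYPERNYM.get(word, word)
        path.append(word)
    for _ in range(down):
        # Like _down in the original: stay put if there are no hyponyms.
        children = HYPONYMS.get(word)
        if children:
            word = rng.choice(children)
        path.append(word)
    return path


# Passing a seeded random.Random makes a run repeatable.
print(toy_absurd_walk("clown", rng=random.Random(42)))
```

With this toy table, the climb from 'clown' always ends at 'person', and the final word is always one of 'fool', 'perspirer', or 'browser', depending on the random choices on the way down.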

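I added some print statements to trace through the process. Each labeled line shows the word going into that step, so the first Hyponym line is where the climb ended up: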

>>> en.noun.absurd_gloss('clown')
Hypernym : clown
Hypernym : fool
Hypernym : simpleton
Hyponym : person
Hyponym : perspirer
Resulting word: perspirer
'a person who perspires'
>>> en.noun.absurd_gloss('clown')
Hypernym : clown
Hypernym : fool
Hypernym : simpleton
Hyponym : person
Hyponym : watcher
Resulting word: browser
'a viewer who looks around casually without seeking anything in particular'
>>> en.noun.absurd_gloss('pelican')
Hypernym : pelican
Hypernym : pelecaniform seabird
Hypernym : seabird
Hyponym : aquatic bird
Hyponym : seabird
Resulting word: gaviiform seabird
'seabirds of the order Gaviiformes'
>>> en.noun.absurd_gloss('person')
Hypernym : person
Hypernym : organism
Hypernym : living thing
Hyponym : object
Hyponym : part
Resulting word: language unit
'one of the natural units into which linguistic messages can be analyzed'
>>> en.noun.absurd_gloss('universe')
Hypernym : universe
Hypernym : natural object
Hypernym : whole
Hyponym : concept
Hyponym : conceptualization
Resulting word: approach
'ideas or actions intended to deal with a problem or situation; "his approach to every problem is to draw up a list of pros and cons"; "an attack on inflation"; "his plan of attack was misguided"'
>>> en.noun.absurd_gloss('animal')
Hypernym : animal
Hypernym : organism
Hypernym : living thing
Hyponym : object
Hyponym : je ne sais quoi
Resulting word: je ne sais quoi
'something indescribable'

There's certainly more you could do by tuning the up and down values to make the results more or less abstract, but I hope this was an interesting look at a method that at first seemed completely nonsensical. 🔮🦄
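As a rough illustration of what tuning those values does, the sketch below counts how many different end words are reachable for a given up and down. It again uses a hypothetical hand-made table (HYPERNYM) and a hypothetical helper (reachable) rather than the library, so the numbers only show the shape of the effect, not what WordNet would give you.

```python
# Hypothetical toy data (child -> parent), based on words from the traces above.
HYPERNYM = {
    "clown": "fool",
    "fool": "simpleton",
    "simpleton": "person",
    "perspirer": "person",
    "watcher": "person",
    "browser": "watcher",
}

# Parent -> [children] lookup, built by inverting the table.
HYPONYMS = {}
for child, parent in HYPERNYM.items():
    HYPONYMS.setdefault(parent, []).append(child)


def reachable(word, up, down):
    """Return every word the up-then-down walk could possibly end on."""
    # The climb is deterministic: always the single parent, or stay put.
    for _ in range(up):
        word = HYPERNYM.get(word, word)
    # The descent branches: follow every child instead of picking one.
    frontier = {word}
    for _ in range(down):
        frontier = {
            child
            for node in frontier
            for child in HYPONYMS.get(node, [node])  # leaves stay where they are
        }
    return frontier


for up in range(4):
    ends = reachable("clown", up, down=2)
    print(up, len(ends), sorted(ends))
```

In this toy table, 'clown' can only come back to itself until up reaches 3. At that point the walk tops out at 'person' and three different end words become possible.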

This is just one of many tools available in the Nodebox English Linguistics Library, which is worth exploring further if you enjoyed this.
