Multilingual wordsearch generator

#javascript #webdev #gamedev #node

Why?

As a language learner most recently interested in Korean/한국어, I've been practicing a lot of new vocabulary. Practicing vocabulary can become tedious if I just follow the same kinds of exercises (e.g. Duolingo, Memrise, physical flashcards) over and over again.

One twist on the flashcards strategy for vocabulary would be a puzzle like a crossword or a wordsearch. Between the two, a crossword is much harder to improvise, as it requires words to share certain letters/characters so they can be organized with crosses.

So, given that they'd be a bit easier to create quickly, I chose to try using custom wordsearches. The plan was to pick a handful of words, maybe by topic, or by difficulty, or randomly, and put them in a wordsearch puzzle together.

... But I couldn't find a good wordsearch generator. There were a bunch I found, but none of them supported different character sets (as far as I cared to figure out).

That's how I decided to make my own.

How it works

At the core of being able to support different languages in my wordsearch generator is the alphabets.json file, which currently looks like this:

alphabets.json

{
    "en": {
        "ranges": [
            [97,122]
        ],
        "upper_ranges": [
            [65,90]
        ]
    },
    "es": {
        "ranges": [
            [97,122],
            [225,233,237,243,250,252]
        ],
        "upper_ranges": [
            [65,90],
            [193,201,205,211,218,220]
        ]
    },
    "ko": {
        "ranges": [
            [44032,55203]
        ],
        "upper_ranges": [
            [44032,55203]
        ]
    }
}

The characters used to fill the wordsearch cells at random are defined by "ranges", though the entries behave more like sets. A list of exactly two values [a, b] means every Unicode code point from a to b, inclusive. A list of any other length means exactly the code points listed. Being able to list noncontiguous points makes it easy to include sparse characters such as the accented vowels used in Spanish (see es).
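To make that convention concrete, here is a minimal sketch of how an alphabets.json entry could be expanded into a pool of characters. The names expandRanges and randomChar are my own for illustration and are not necessarily what the generator uses.

```javascript
// Hypothetical helper (not necessarily the generator's real code).
// Assumed convention: a two-element list [a, b] is an inclusive range of
// code points; a list of any other length enumerates individual code points.
function expandRanges(ranges) {
  const chars = [];
  for (const entry of ranges) {
    if (entry.length === 2) {
      const [start, end] = entry;
      for (let cp = start; cp <= end; cp++) {
        chars.push(String.fromCodePoint(cp));
      }
    } else {
      for (const cp of entry) {
        chars.push(String.fromCodePoint(cp));
      }
    }
  }
  return chars;
}

// Pick one character uniformly at random from an expanded pool.
function randomChar(chars, rng = Math.random) {
  return chars[Math.floor(rng() * chars.length)];
}

// The "es" entry from alphabets.json: a-z plus six accented vowels.
const es = { ranges: [[97, 122], [225, 233, 237, 243, 250, 252]] };
const spanish = expandRanges(es.ranges);
console.log(spanish.length);             // 32
console.log(spanish.slice(-6).join('')); // áéíóúü
console.log(randomChar(spanish));        // one random filler character
```

One consequence of this convention is that a list of exactly two values is always read as a range, so two isolated code points would need to be written as two one-element lists.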

Once the language alphabet can be defined with this system, the generator is able to fill cells with random characters and later populate some of them with hidden words that the user provides.

My placement algorithm for the hidden words is essentially brute-force. I pick a direction and then try random spots until I find available space, with a maximum number of attempts, beyond which the word is skipped.
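As a rough sketch of that brute-force idea (not the actual source), the function below picks one direction, then tries random starting cells up to a limit. The name placeWord, the default of 100 attempts, the eight-direction list, and the rule that a word may only use cells no other word occupies are all assumptions made for this example.

```javascript
// Illustrative sketch only; names, defaults, and rules are assumptions.
// Each direction is a [rowStep, colStep] pair.
const DIRECTIONS = [
  [0, 1], [1, 0], [1, 1], [-1, 1],
  [0, -1], [-1, 0], [-1, -1], [1, -1],
];

// grid: 2D array of characters; occupied: 2D array of booleans marking
// cells already used by a hidden word. Returns true if the word was
// placed, false if it was skipped after maxAttempts tries.
function placeWord(grid, occupied, word, maxAttempts = 100, rng = Math.random) {
  const size = grid.length;
  const letters = Array.from(word); // split by code point, not UTF-16 unit
  const [dr, dc] = DIRECTIONS[Math.floor(rng() * DIRECTIONS.length)];

  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const row = Math.floor(rng() * size);
    const col = Math.floor(rng() * size);
    const endRow = row + dr * (letters.length - 1);
    const endCol = col + dc * (letters.length - 1);
    // Reject starting cells from which the word would run off the grid.
    if (endRow < 0 || endRow >= size || endCol < 0 || endCol >= size) continue;

    // Every cell along the path must still be free.
    const fits = letters.every((_, i) => !occupied[row + dr * i][col + dc * i]);
    if (!fits) continue;

    letters.forEach((ch, i) => {
      grid[row + dr * i][col + dc * i] = ch;
      occupied[row + dr * i][col + dc * i] = true;
    });
    return true;
  }
  return false; // gave up: the caller skips this word
}

// Usage: a 10x10 grid of placeholder cells.
const size = 10;
const grid = Array.from({ length: size }, () => Array(size).fill('.'));
const occupied = Array.from({ length: size }, () => Array(size).fill(false));
for (const word of ['apple', 'boat', '사과']) {
  if (!placeWord(grid, occupied, word)) console.log(`skipped: ${word}`);
}
console.log(grid.map((row) => row.join(' ')).join('\n'));
```

Capping the attempts keeps generation fast even when the grid is crowded, at the cost of occasionally dropping a word that a smarter search could have fit.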

How to use it

The best-supported way to describe a wordsearch to the generator is currently a JSON file, like so:

example_en.json

{
    "language": "en",
    "size": 15,
    "words": [
        "apple:a fruit",
        "skull:head bone",
        "scissors:cuts paper",
        "boat:floats on water",
        "bird:feathers and beaks"
    ]
}
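Each entry in "words" appears to pair a hidden word with a clue, separated by a colon. A small sketch of how such entries might be split follows; parseEntry is a hypothetical name, and treating everything after the first colon as the clue is my assumption.

```javascript
// Hypothetical parser for "word:clue" entries; splits on the first colon
// only, so a clue may itself contain colons.
function parseEntry(entry) {
  const idx = entry.indexOf(':');
  if (idx === -1) return { word: entry.trim(), clue: '' };
  return {
    word: entry.slice(0, idx).trim(),
    clue: entry.slice(idx + 1).trim(),
  };
}

const config = {
  language: 'en',
  size: 15,
  words: ['apple:a fruit', 'skull:head bone'],
};
const entries = config.words.map(parseEntry);
console.log(entries[0]); // { word: 'apple', clue: 'a fruit' }
```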

Running the generator on example_en.json produces a puzzle like:

Extension

Hopefully this implementation is extensible enough to support other languages, including arbitrary character sets, just by adding entries to alphabets.json.

I'm also considering making a printer-friendly output option in the webpage example so that the generated wordsearch is easy to make into a physical worksheet.