geraldew

Posted on Jun 6, 2021

Name Coincidences and the Occasional Coder

#python #programming

One of the perils of a coding life in which you don't spend either ALL or MOST of your time in a single language is that you may have a lot of trouble working out when things with the same name are so named for important reasons and when it is merely coincidence.

For this example, I will use Python as that is the language I've recently been dealing with, but the issue is the same across many languages.

In short, that issue is:

when you're not VERY familiar with a language, it's hard to tell the difference between arbitrary names of identifiers and the names of language/library elements.

That's a big problem when you're reading someone else's code - perhaps to learn how to achieve something you need in your own program. But, one benefit of age is that I know the same effect will occur for me and my own code when enough time has passed since I wrote it.

The example I will use is something I'm currently trying to learn from. It can be found in this online article.

How to setup correctly an application with Python and Tkinter

By the way, to state the hopefully obvious: this is not a critique of the code itself, nor the generosity of the person who wrote and shared it.

Here's the code:

import tkinter as tk
from types import SimpleNamespace
from queue import Queue
from enum import Enum
from threading import Thread

class Messages(Enum):
    CLICK = 0

def updatecycle(guiref, model, queue):
    while True:
        msg = queue.get()
        if msg == Messages.CLICK:
            model.count += 1
            guiref.label.set("Clicked: {}".format(model.count))

def gui(root, queue):
    label = tk.StringVar()
    label.set("Clicked: 0")
    tk.Label(root, textvariable=label).pack()
    tk.Button(root, text="Click me!", command=lambda : queue.put(Messages.CLICK)).pack()
    return SimpleNamespace(label=label)

if __name__ == '__main__':
    root = tk.Tk()
    queue = Queue()
    guiref = gui(root, queue)
    model = SimpleNamespace(count=0)
    t = Thread(target=updatecycle, args=(guiref, model, queue,))
    t.daemon = True
    t.start()
    tk.mainloop()

So what's the problem here?

Well, there are several ways in which that example uses the exact same name for things that could have had separate names. This causes confusion for the language-newbie reader who doesn't yet know when those correspondences are vital and when not.

Let's start at the top of the code sample. A Python novice probably won't already know the libraries being used:

from threading import Thread
from queue import Queue
from types import SimpleNamespace

nor the specific features being imported from them. Do note that this is true even if you're an experienced programmer. Knowing about concepts such as queues and threads won't mean you know the named features in these particular Python libraries. So this is a background context for this issue. It means the reader will already be guessing which text tokens will come from those imports.

This is especially so where "dot" naming gets used, e.g. t.daemon and model.count. If I've understood correctly, the first is referencing a fixed feature from a library (daemon is an attribute that threading.Thread defines) and the second is quoting a name that was set in this very code example (count comes from SimpleNamespace(count=0)). Admittedly this is touching on a syntax fluidity in Python that is either a part of its brilliance or part of its uncompilability (ouch, sorry to coin that).
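One way to check which side of that line a dotted name falls on is to ask the class itself. This is a minimal sketch, not part of the article's code, and it assumes nothing beyond the standard library; the name model simply mirrors the example.

```python
from threading import Thread
from types import SimpleNamespace

# daemon is defined by the Thread class itself, so it exists before any of our code runs.
thread_has_daemon = hasattr(Thread, "daemon")

# count is not part of SimpleNamespace; it exists only on the instance we give it to.
namespace_has_count = hasattr(SimpleNamespace, "count")
model = SimpleNamespace(count=0)
model_has_count = hasattr(model, "count")

print(thread_has_daemon, namespace_has_count, model_has_count)  # True False True
```

Nothing at the point of use (t.daemon versus model.count) shows this difference; it only appears when you go and look at where the name came from.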

The main problem though is that the programmer has needlessly used coincident names for variables. If you use a name-match-highlighting editor then these coincidences "light up" as you put a cursor on them. And thus, the "why is that also there?" questions come to mind when reading the code.

The way that Python handles variable naming scope is different to many (most?) other languages. I'm from a Pascal background (yes, I'm old) where scope is quite strict in terms of the program text.

Thus if I see something with the same variable name in different scopes then I'm inclined to interpret that as being the exact same reference - because in many languages, an inner structure can "see" all the variables in the outer scopes (at least until you hit the point of modularity).

The first name coincidence that caught my wondering eye was:

    return SimpleNamespace(label=label)

and that was before I looked up SimpleNamespace to understand what kind of thing was happening there. It's inherently obvious that label=label can only be meaningful code if one label is not referring to the same thing as the other one.
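A small sketch of what the two sides of label=label are, assuming a plain string as a stand-in for the tk.StringVar so that it runs without Tkinter. The names caption and sns_label are hypothetical, chosen only to make the two roles differ.

```python
from types import SimpleNamespace

label = "a stand-in for the tk.StringVar"  # local variable (hypothetical value)

# Left of "=": the attribute name to create. Right of "=": the local variable's value.
same_names = SimpleNamespace(label=label)

# Renaming each side independently shows they are separate things.
caption = label
different_names = SimpleNamespace(sns_label=caption)

print(same_names.label is different_names.sns_label)  # True
```

Both namespaces end up holding the very same object; only the attribute name used to reach it has changed.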

It's like visiting a village where all the sons are named "John". Yes that's possible to do, and yes it simplifies the problem of remembering what any of them are called, but it's a bit crazy to think that it won't make conversational references tricky.

The next ones to confuse me were the plainer coincidences, for queue and root - which get created in the main, but are passed in to the functions. The exact same name is being used for the outer calling use and the inner passed-in use. Understanding whether these are "really the same" requires a clear understanding of two things:

scope;
run-time behaviour.

I'm not about to fully discuss how these work in Python, because frankly I still don't entirely understand it. Which is to say that I can always make sense of things as I write and then debug code, but Python's rationale has not yet become native-think inside my head, as a mental basis while writing my code. Also, fully understanding these aspects of Python would be a full article on its own.
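Without attempting that full treatment, here is a minimal sketch of the one part that matters for queue and root. It assumes a simplified setup with no Tkinter; the function names and the second queue called other are hypothetical.

```python
from queue import Queue

def put_click(queue):
    # Here "queue" is the parameter: a local name for whatever object was passed in.
    queue.put("CLICK")
    return queue

def put_click_renamed(p_queue):
    # Identical behaviour; only the local name differs.
    p_queue.put("CLICK")
    return p_queue

queue = Queue()   # module-level name, as in the example's main block
other = Queue()   # a second, unrelated queue (hypothetical)

returned = put_click(other)   # inside the call, "queue" means other, not the outer queue
put_click_renamed(queue)

print(returned is other, other.qsize(), queue.qsize())  # True 1 1
```

So the inner queue is "really the same" as the outer one only because the caller happens to pass it in. The parameter hides the outer name inside the function (scope), and which object it refers to is decided at each call (run-time behaviour).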

As an aside, I suspect that some of the things that make Python quite the success that it has become are the same things that "don't make sense" to me. Life is compromise - c'est la vie.

We'll see below that these same-names don't have to be so, so why have they been written here as the same-name? Well, I think I do "get" why people write like this - in that mainly I expect that they think this is keeping the code looking simpler. The catch is that this is only true where the language/library knowledge is already given.

My point is that this is part of the slide to write-only code. This might be the very, very thin end of a very long wedge - but the slide is the same one.

Here's how the code looks after I went through and renamed references to make each subset usage distinct.

# barebones Tkinter Two-Thread Example
import tkinter as tk
from types import SimpleNamespace
from queue import Queue
from enum import Enum
from threading import Thread

class Messages( Enum):
    CLICK = 0

def updatecycle( p_guiref, p_model, p_queue):
    while True:
        i_msg = p_queue.get()
        if i_msg == Messages.CLICK:
            p_model.count += 1
            p_guiref.sns_label.set( "Clicked: {}".format( p_model.count))

def gui( p_root, p_queue):
    i_label_a = tk.StringVar()
    i_label_a.set( "Clicked: 0")
    tk.Label( p_root, textvariable=i_label_a).pack()
    tk.Button( p_root, text="Click me!", command=lambda : p_queue.put( Messages.CLICK)).pack()
    return SimpleNamespace( sns_label=i_label_a)

if __name__ == '__main__':
    i_root = tk.Tk()
    i_queue = Queue()
    i_guiref = gui( i_root, i_queue)
    i_model = SimpleNamespace( count=0)
    i_t = Thread( target=updatecycle, args=( i_guiref, i_model, i_queue,))
    i_t.daemon = True
    i_t.start()
    tk.mainloop()

The method I've used there is simple.

a variable that is Internal to a scope is prefixed with "i_"
a variable that is Passed in to a scope is prefixed with "p_"

This means I can instantly see where/how the references are local to scope.

Not that these clarifications can explain everything. Ultimately, with a language such as Python there's so much going on under the hood that a surface read of the references doesn't go far.

The example of that here is the presence of (indeed creation of) sns_label in the line:

    return SimpleNamespace( sns_label=i_label_a)

and its subsequent usage in

            p_guiref.sns_label.set( "Clicked: {}".format( p_model.count))

which despite the "Simple" in the name, is a sign of some deeper inside-the-interpreter goings on for which an understanding is required of the differences between Python functions and classes.

That's something too complex to cover here - but in this context the problem is that they look the same when/where they are used, a base fact of Python that the coder cannot change.

In this example though, it can be hard for a Python-language-newbie to see how/why the return statement creates a reference that will be de-referenced from p_guiref.
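A rough sketch of what that return statement sets up, assuming a one-item list as a stand-in for the tk.StringVar. HandRolledNamespace is a hypothetical class written for illustration; the Python documentation describes SimpleNamespace as roughly equivalent to a class of this shape.

```python
from types import SimpleNamespace

class HandRolledNamespace:
    """Hypothetical stand-in, roughly what SimpleNamespace does with its keywords."""
    def __init__(self, **kwargs):
        # Each keyword name becomes an attribute name on the new object.
        self.__dict__.update(kwargs)

def gui_stub():
    i_label_a = ["Clicked: 0"]          # a list standing in for the tk.StringVar
    return SimpleNamespace(sns_label=i_label_a)

def gui_stub_hand_rolled():
    i_label_a = ["Clicked: 0"]
    return HandRolledNamespace(sns_label=i_label_a)

p_guiref = gui_stub()
p_guiref.sns_label[0] = "Clicked: 1"    # same dotted access the update function uses

print(vars(p_guiref))                   # {'sns_label': ['Clicked: 1']}
print(vars(gui_stub_hand_rolled()))     # {'sns_label': ['Clicked: 0']}
```

The call that looks like a function call is really constructing an object, and the keyword sns_label is what later makes p_guiref.sns_label a valid thing to write.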

For this particular code example, it was only after removing the unnecessary coincidences that I could pinpoint which part of this I would need to investigate fully, so that I could take what I had learned from it and apply it to my existing Tkinter application.

FWIW I'm trying to work out how to add a minimal non-blocking behaviour to my application so that the interface doesn't freeze while a long-running action is in progress and also for that action to show progress updates to the Tkinter-drawn interface. At time of writing I'm still thinking about how that will work.

p.s. of course, in a way, my need to re-mark the code so that I am sure I know exactly what is going on, and the process of me doing that is itself an excellent way to learn the code. And if I hadn't felt I had to do that, would I then understand the code as well? There's no solving that paradox. Sometimes that is very worthwhile, and sometimes that just slows me down and annoys me. In this case, I'm coding for private reasons, so I'm not too bothered either way. But it seemed like a good example to talk about the issue.

and as a personal example, and a reminder that Python is merely used as an example here, I regularly go through a similar process for the main language that I write in - SQL - and there I'm equally annoyed and yet strangely grateful that no-one else seems to format SQL as I do (and that's a whole other can of worms).

p.p.s.
For a classic example of how Python produces strange name-place coincidences, look at the syntax for named tuples.

from collections import namedtuple
Point = namedtuple("Point", "x y z")
this_point = Point(1, 2, 3)

It can be hard to understand why in the first line we need to quote the name of Point twice, once as a string and then also as the named variable. To see how that question gets answered, see these examples:
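One short illustration, a sketch rather than anything taken from those sources: the string and the variable are two different names, which becomes visible once they are made to differ. The name Pt is hypothetical.

```python
from collections import namedtuple

# The string is the name the new class carries around with it.
# The variable on the left is just what we choose to call the class in this scope.
Pt = namedtuple("Point", "x y z")
this_point = Pt(1, 2, 3)

class_name = Pt.__name__
shown_as = repr(this_point)
print(class_name, shown_as)  # Point Point(x=1, y=2, z=3)
```

The usual habit of making the two match is a convention for the reader's benefit, which is exactly the kind of coincidence that is hard to judge when you are new to the language.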

p.p.p.s. something that I didn't touch on in the above is the languages/libraries where it is common practice to use coincident names that merely vary by their capitalisation (yesJavaLookingAtYou). Similarly, there are many people who think this is a great thing while others respell that as "grate".
