When it came time to sign up for my first college classes, I was completely overwhelmed by the choices. There were numerous biology classes, Spanish classes, and chemistry classes, but worst of all, there were so many computer science classes. After researching my options, I decided to start with COMP 401: Programming Fundamentals. As the name suggests, it focused heavily on the fundamentals behind programming languages: syntax, best practices, and, most importantly, data types. When I took this class, my professor primarily taught these skills in Java. While nearly every college begins introductory programming classes by teaching the concept of data types, they have done something strange over the last few years: they have moved to using dynamically typed languages as students’ introduction to programming.
Let’s look at the declaration of a new variable in both Python (dynamically typed) and Java (statically typed).
Python
new_number = 12
new_string = '12'
Java
int new_number = 12;
String new_string = "12";
At first glance, the Python version is much easier to read because it looks much more like English. However, it oversimplifies some very fundamental programming language concepts. To a new developer, it is not clear that there is a distinct difference in the way new_number and new_string are stored in memory. To them, one has quotes and one doesn't.
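One way to make that difference visible, sketched here with the same two variables, is to ask Python for each value's type using the built-in type() function. The names number_type and string_type are introduced only for this illustration.

```python
new_number = 12
new_string = '12'

# type() reports the runtime type of a value; __name__ gives its short name
number_type = type(new_number).__name__
string_type = type(new_string).__name__

print(number_type)  # int
print(string_type)  # str

# The two values look alike on screen, but Python does not treat them as equal
print(new_number == new_string)  # False
```

Nothing in the assignment statements themselves announces these types, so a beginner has to go looking for them.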
A new developer may try to run something like:
Python
print(new_number + new_string)
# Raises a TypeError
and they may wonder why an error is thrown. It is not immediately clear why this particular expression doesn't work. Upon reading the error message, they will eventually discover the issue, but this becomes more troublesome as a codebase gets larger.
The issue: Think of a String like a sentence in English. It's represented like an English word in memory. An integer is represented like a mathematical number that you can do operations on (addition, subtraction, etc.). This example essentially illustrates the computer trying to add the English version of "twelve" to the mathematical number 12.
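As a minimal sketch of how the mismatch can be resolved, assuming the goal is either numeric addition or text concatenation, one of the two values can be converted explicitly so that both operands share a type. The names numeric_sum and joined_text are made up for this example.

```python
new_number = 12
new_string = '12'

# Convert the string to an integer to add the two values numerically
numeric_sum = new_number + int(new_string)
print(numeric_sum)  # 24

# Or convert the integer to a string to join the two pieces of text
joined_text = str(new_number) + new_string
print(joined_text)  # 1212
```

Either conversion works, but the programmer has to decide which meaning was intended; Python will not guess.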
Let's look at a more complex example:
Python
basic_list = [12, 5, '3', 8, '7']
sum = 0
for n in basic_list:
    sum += n  # Raises a TypeError once n is the string '3'
print(sum)
To an experienced programmer, it is immediately clear why this is an issue (you can't add integers to strings, as they are represented differently in memory). However, a new developer may not see this distinction. If they are learning Python as their first programming language, this oversimplified abstraction of how data is stored in memory can be extremely confusing.
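For comparison, here is a sketch of one way to make the loop work, assuming every string in the list holds a whole number: convert each element with int() before adding it. The name total is used instead of sum so that Python's built-in sum() function is not shadowed.

```python
basic_list = [12, 5, '3', 8, '7']

total = 0
for n in basic_list:
    # int() leaves integers unchanged and parses numeric strings like '3'
    total += int(n)

print(total)  # 35
```

The fix is a single call, but finding it requires already knowing that the list mixes two different types.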
Let's take a look at a similar program in Java:
Java
import java.util.Arrays;
import java.util.List;

List<Integer> basic_list = Arrays.asList(12, 5, "3", 8, "7");
// The compiler rejects the line above, so the program
// never runs and the rest of the code is never reached
int sum = 0;
for (int n : basic_list) {
    sum += n;
}
System.out.println(sum);
Java, since it is statically typed, doesn't allow you to make the mistake of putting a String into a List of Integers. The developer can immediately see that basic_list should only contain Integers. For a newer developer, this may seem annoying at first, but it prevents hard-to-understand logic errors later in the development process.
While the difference is subtle, to a newer developer this distinction is huge. I have had many friends and students come to me after receiving their first programming assignment, baffled by why their new program isn't returning the expected results. In my experience, Python typing issues are almost always the root cause of the problem. Starting with a safe, statically typed language like Java (even though many see it as outdated) is the way to go if you want your students to understand typing.
Stay tuned for part 2 of the series coming soon!
Top comments (3)
I like types. Even in a dynamically typed language I try to practice Type Driven Development. Once upon a time, I even liked Java exactly because of the (false) sense of security of compilation (static type checking) - to the point it put me off learning JavaScript (I got over it). However once I had a chance to use OCaml and Haskell, I finally got a taste of what a real type system could accomplish - but those benefits come with a learning curve.
I don't share the mainstream enthusiasm for Python as a first language [1].
On the surface the argument to start with a statically typed language for the sake of "safety" makes sense but people who have dedicated their career to computer science education have come to a different conclusion; The Structure and Interpretation of the Computer Science Curriculum:
As it is How to Design Programs 2e (HtDP) uses 5 student languages each one relaxing the constraints on capabilities more than the previous one but none include static type checking - even though Typed Racket exists.
That said, "those pesky parentheses" did turn out to be a real obstacle for some students - which is why Pyret was created. It's used in Programming and Programming Languages (PAPL). And Pyret has an optional static type checker.
The real issue is that educational institutions often feel compelled to teach programming with a résumé-friendly programming language, i.e. one that a significant number of employers actually use. Perhaps if that were less of an issue it wouldn't matter if programming was taught without static type checking because that can be added next time.
The issue is that when dynamically typed languages like Python or JavaScript are used as first languages, the student may not venture beyond that first experience and may miss out on capabilities that other languages have to offer. Similarly, one shouldn't get overly fixated on "static type checking" as an essential language feature - there are dynamically (but strongly) typed languages that have other things to offer (e.g. Erlang, Elixir).
[1]: Edsger W. Dijkstra, "To the Budget Council concerning Haskell", Austin, 12 April 2001
Great article.
Looking forward to the next one.
Thank you! It means a lot!