
How did linguistics influence programming?



I know that mathematics influenced programming big time! And I'm sure the words big time are not enough to describe that.

But I was wondering: how did linguistics influence programming? Is it too general a question to ask?

 

Various grammars come up with some frequency, particularly with regular expressions and query languages. Designing languages and building parsers, interpreters, and compilers naturally entails getting somewhat philosophical about the concept even if you don't dive all the way into programming language theory and its neighbors which have a variety of intersections, overlaps, and dialectics with linguistics. And while its impact hasn't really been noted outside academic circles, semiotics also has a lot to offer practitioners in a field which is fundamentally about the organization and transmission of information.

For a specific case, Larry Wall was a linguist before he became a programmer and that had a big influence on Perl.

 

I think that the advances in linguistics by Noam Chomsky and others were a cornerstone of compiler theory.

More recently, semantic computing has also been at the basis of the latest industrial revolution (one based on knowledge discovery and knowledge management).

So... how much has linguistics influenced programming? A lot!

You can also see it in daily tasks. We need to code in such a way that it is understandable not only by computers, but also by humans.

For example, using negated names for boolean variables is not good. Try to read this:

    if not not_exclude_TERM:
        ...

Your brain needs extra time to understand that code. It is useful to name all boolean variables positively, and the same goes for checkbox labels in the user preferences.
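
For instance, a hypothetical rewrite with a positively named flag (the names here are made up) reads the way you would say it out loud:

    # Hypothetical rewrite: the flag is named positively, so there is
    # no double negative to untangle when reading the condition.
    include_term = True

    if include_term:
        print("processing the term")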

That's only one example.

 

Naming things is one of the hardest things to do in programming, so it's definitely very important. Going up the abstraction layers, giving the right/wrong/adequate name to things significantly impacts how we program, design apps, and architect systems.

A good example is Alan Kay regretting having called it Object Orientation rather than Message Orientation.

The fact that English was the "pioneering" language also has a big impact.

Things always get lost in translation, so it becomes crucial to learn English, and having it as a second language also affects how we name things, not to mention encoding.

I also feel that carrying linguistic meaning over to mathematical concepts is an interesting crossover in functional programming, and people (including me) are finding it a much more expressive way to program.

And last but not least, category theory is also something to take a look at. Naming things not only categorizes them but also gives them meaning, and meaning can be relative to context; it changes over time, place, culture, and language (speaking with the little knowledge I have on the subject; it's a heavy thing to study and I still haven't wrapped my head around it).

 

Automata theory is a really interesting field of study which is closely related to formal language theory, and it forms the bedrock of our theory of computation. My attempt at a simple explanation: automata theory describes very limited, theoretical computers that are able to recognize different languages. Regular languages can be recognized by finite state machines, context-free languages by pushdown automata, and the most powerful automata are Turing machines. By studying Turing machines, we can figure out which problems they are able to solve and which they cannot. When we call a programming language "Turing complete", we mean that the language can simulate a Turing machine, and thus is technically able to solve any problem a Turing machine can solve (no comment on how difficult it would be to write that solution!).

A basic understanding of automata theory, while not necessary for software development, gives one a better understanding of computation as an abstract concept. Finite state machines are seeing a resurgence in popularity, as people have discovered that they model UI states very well. This whole field wouldn't exist (at least not as we know it) without the work of formal linguistics!
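
As a rough illustration (a hypothetical sketch with made-up state and event names, not any particular library), such a state machine can be written as a plain transition table:

    # Hypothetical sketch of a finite state machine for a small UI flow.
    # The state and event names are invented; the point is that every
    # legal transition is declared up front instead of buried in if/else.
    TRANSITIONS = {
        ("idle", "submit"): "loading",
        ("loading", "success"): "done",
        ("loading", "failure"): "error",
        ("error", "retry"): "loading",
    }

    def next_state(state, event):
        # Unknown (state, event) pairs are ignored: the machine stays put.
        return TRANSITIONS.get((state, event), state)

    state = "idle"
    for event in ["submit", "failure", "retry", "success"]:
        state = next_state(state, event)
        print(event, "->", state)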

 

Indeed.

The main influence of automata theory on programming was in compiler theory.

It is used in the lexical analysis phase, and in the construction of parsers and compilers.

Finite automata made it possible to implement such things in a generic way.

As you say, the field wouldn't exist as we know it; maybe it would exist in another form.

Maybe the parsers and compilers would be very ugly, and the code would be full of conditional constructs (if, then, else).
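
To make that concrete, here is a hypothetical toy lexer (invented token names, not taken from any real compiler) where each token class is a regular expression; the regex engine plays the role of the finite automaton, and no nest of if/then/else is needed:

    import re

    # Hypothetical toy lexer: each token class is a regular expression,
    # and regular expressions correspond to finite automata in theory.
    TOKEN_SPEC = [
        ("NUMBER", r"\d+"),
        ("IDENT",  r"[A-Za-z_]\w*"),
        ("OP",     r"[+\-*/=]"),
        ("SKIP",   r"\s+"),
    ]
    MASTER = re.compile("|".join(f"(?P<{name}>{rx})" for name, rx in TOKEN_SPEC))

    def tokenize(text):
        for match in MASTER.finditer(text):
            if match.lastgroup != "SKIP":
                yield (match.lastgroup, match.group())

    print(list(tokenize("x = 42 + y")))
    # [('IDENT', 'x'), ('OP', '='), ('NUMBER', '42'), ('OP', '+'), ('IDENT', 'y')]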

 

This post may be of interest

On this topic, I find it very interesting that Ruby, a pioneer in "simple English-esque syntax", was developed by a non-native English speaker (and Japanese is pretty far from English!)

 

English and Japanese share the concepts of "subject", "object", and "verb", expressed as discrete "words", as well as plenty of other categories, even if they put things in a slightly different order. They are not that different. Also, besides borrowing code words from English, the syntax of most computer languages has little to do with English syntax, and even AppleScript, which was originally designed to read as if it were English, starts to diverge a lot when you start doing complex things.

Regarding Ruby, Matsumoto was a self-taught programmer before entering one of the most prestigious universities in Japan, where he studied programming languages and compilers. As influences for Ruby, he cites Perl, Smalltalk, Eiffel, Ada, and Lisp. The syntax of most computer languages does not involve an explicit subject, since most statements are commands and use the imperative, so here again there is little connection between natural languages and the codes we use to command our machines.

What is interesting are attempts at creating languages for children that actively use native-language code words, and sometimes syntax (or at least word order), to keep the language intuitive. But it looks like such languages are being put aside in favor of "block" languages a la Scratch (localized in plenty of languages).

Back to the OP: linguistics is the field that studies languages that evolved as means of communication, among other things, in context-rich environments (natural languages), while computer languages are means of communication designed for context-poor environments (or rather for devices that do not have autonomous access to that context). They were specifically designed to solve a given category of problems, while natural languages are not meant to solve any given category of problem (hence "poetry").

I found two links that kind of relate to the discussion:

actfl.org/news/position-statements...

and

researchgate.net/publication/23478...

but not much else...

 

Interesting discussion, but it seems to imply that there is no relation between natural languages (referred to in the ACTFL link as world languages) and computer languages.

Hence it seems to suggest no relation between linguistics and computer languages. Doesn't it?

Let me point out that even though natural languages and computer languages have different goals, it doesn't mean that the knowledge mankind has gained studying the former (linguistics) has not been useful for the theory behind the latter.

I think the ACTFL made that statement in an attempt to make clear the mission of the organization and the benefits students will gain from their courses. The statement seems to be directed not at linguists or computer scientists, but at regular people who want to benefit from their training and might be confused.

Just note that they are not comparing the nature of the languages; they are comparing a computer coding course with a world language course. Given that the skills expected from the two courses are different, of course the courses are not equivalent.

Nevertheless, even though computer languages are strongly scoped by their context (computer hardware, problem domain, purpose), the theory behind them has a lot in common with linguistics.

As a matter of fact, computer languages have an old classification: they are said to be high-level languages when they more closely resemble human language, and low-level languages when they are closer to machine code.

We might then say that Scratch, or other easy visual languages, are high-level (within the limited scope they are built for) if humans can use them almost without coding.

If you would like to go deeper into this without going into the theory of automata, compilers, etc. (more math there), then I suggest looking at it from the perspective of linguistics.

Take the book Knowledge Representation and the Semantics of Natural Language and notice that even though some natural languages have more than 10,000 vocabulary words and grammatical structures (as pointed out by the ACTFL in your link), there are just a few dozen semantic structures that you can use to express something. The book even teaches you how to build a graph with them to analyze documents at the semantic level.

Once you have learned all those semantic structures, then look at the programming paradigms of the computer languages that you use.

I think you might be surprised to find that many programming concepts are built to implement those semantic structures. For example, OOP takes the concepts of class and instance and the relations between them (instantiation, inheritance, ...).
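
As a small hypothetical sketch (the class names here are invented), those relations map directly onto class syntax:

    # Hypothetical sketch: "is a kind of" (inheritance) and
    # "is an instance of" (instantiation) expressed as class syntax.
    class Animal:
        pass

    class Dog(Animal):   # Dog "is a kind of" Animal (inheritance)
        pass

    rex = Dog()          # rex "is an instance of" Dog (instantiation)

    print(isinstance(rex, Animal))   # True
    print(issubclass(Dog, Animal))   # True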

Nevertheless, there are semantic structures related to time that might be better represented in other programming paradigms, like event-driven programming.

This correspondence did not happen by accident. It is precisely because everything you need to tell a computer, you need to understand first, and you need a symbol with meaning in your mind. You can speak many languages (and so can a computer), but the meaning is one and obeys certain rules given by how your mind works (ref: Gnosiology).

This topic is so fascinating that humans have debated for millennia whether meaning exists on its own and we discover it, or whether we create meaning to model reality (ref: Realism).

It is not a coincidence that computer languages have to do with natural languages; they share a common theory because they both come from human minds, and the same semantic structures were used in the process.

Of course, that is not the case for low-level languages, but humans try to design languages at a higher level, so that they can understand them.

It remains to be seen what might happen if machines achieve consciousness in the future. Would they have their own semantic structures? Would they create their own programming languages and paradigms?

Maybe then they would say: "Human linguistics? Not related to our programming languages."

But I guess not, because it is said that the way humans reason, acquire, and classify knowledge is given by the way complexity is organized in the universe (ref: Systems_theory).

The way complexity is organized in nature is well explained in the first chapters of Object-Oriented Analysis and Design with Applications. I guess that if machines want to generate useful code to control this universe and survive, their minds will need to handle complexity with similar structures.

Thank you for the reply.

I did not suggest that natural languages did not influence computer languages, only that linguistics (i.e., the field of study that deals with natural languages) does not seem to have considered computer languages as belonging to its field and thus has likely not influenced their evolution much, even if, as was mentioned earlier in the discussion, both language groups share "meta" concepts like "grammar", "syntax", etc.

But the thing is, we don't describe computer languages with terms like "verb" or "object" (at least not in the grammatical sense of the word), and most of the terms we use to describe them are borrowed from mathematics (function, argument, etc.), which is another field that has little to do with linguistics (and certainly does not want to be seen as being influenced by such a "soft" science :).

Both language groups deal in symbolic expressions that allow, or do not allow, a given degree of expressiveness, and computer languages, as products of the symbolic work of humans, are bound (for now) by the expressiveness of human languages.

As far as evolution is concerned, most computer languages (Lisp dialects are an exception; I don't know of others) do not make it easy to create new structures that belong to the language, unlike natural languages.

There is a field I am very interested in right now, language learning, where the two groups also do not seem to share much. Computer languages are mostly taught through "rote learning", while natural languages, especially when communication is a requirement, are not. In other words, for computers we need to learn (and are practically limited to) what is "grammatically correct", while for humans we need to learn what conveys meaning, which is limitless and includes only a small part of what is grammatically correct.

To conclude this reply, artificial languages are influenced by natural languages, but I have yet to find an influence of linguistic research in the area of artificial language "design", which seems to me to be the question asked by the OP.

But the thing is, we don't describe computer languages with terms like "verb" or "object" (at least not in the grammatical sense of the word), and most of the terms we use to describe them are borrowed from mathematics (function, argument, etc.)

Of course the grammar is limited by practical concerns. The languages include just what is needed at the syntactic and grammatical level in order to express the subset of semantics that is needed in the problem domain for which the language was designed.

I do not know all the computer languages on the list in detail, but are you sure there is no programming paradigm with such grammatical features?

In practical terms, what we do is read the code and understand the meaning. That act of insight is a kind of matching at the semantic level.

If you read for example:

   order.add(item)

you have a subject (order), a predicate (add), and an object (item).

... artificial languages are influenced by natural languages but I have yet to find an influence of linguistic research in the area of artificial language "design", which seems to me the question asked by the OP.

They gave us a generic way to design any language by means of its grammar and to automatically generate compilers and parsers. Isn't that one influence?
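
For example, in a hypothetical miniature (nothing like a production parser generator, and the grammar is invented), once the grammar is written down the parser follows from it almost mechanically, one function per rule, which is essentially what parser generators automate:

    import re

    # Hypothetical miniature of grammar-driven parsing.
    # Grammar:  expr -> term (('+' | '-') term)*
    #           term -> NUMBER
    def parse_expr(tokens, pos=0):
        value, pos = parse_term(tokens, pos)
        while pos < len(tokens) and tokens[pos] in "+-":
            op, pos = tokens[pos], pos + 1
            rhs, pos = parse_term(tokens, pos)
            value = value + rhs if op == "+" else value - rhs
        return value, pos

    def parse_term(tokens, pos):
        return int(tokens[pos]), pos + 1

    tokens = re.findall(r"\d+|[+\-]", "12 + 30 - 2")
    print(parse_expr(tokens)[0])   # 40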

Also, you will find a lot more at the level of semantics. Remember that linguistics also covers semantics.

Over the past decade there has been a lot of work in the field of Semantic Computing; check out that work as well.

...for humans we need to learn what conveys meaning, which is limitless and includes only a small part of what is grammatically correct.

Then you'll like the book about knowledge representation. When you see which semantics are behind every word, you can better select the proper word to avoid ambiguity.

You might also find a lot of useful material in General Semantics and the Institute of General Semantics.

Mathematics and computer science having little to do with linguistics seems to be more often the position of people with an interest in the former two than a generally shared perspective. It's true that programming languages are formally mathematical, but they're still languages with grammars and parsing, and even room for implication, ambiguity, and authorial expression now and then; and math or no, learners often find framing, for example, object orientation in terms of nouns and verbs more intuitive than staring at proofs.

Manufactured languages, software architectures, and specialized interfaces are much more limited in scope than natural languages, texts, and grammars. But there are still rules they conform to and tendencies they exhibit, so it's rather a waste of time to ignore the vocabulary and theoretical toolkit already developed for us by linguists and semioticians. And of course, treating natural languages themselves in mathematical terms is a topic of no small interest on either side of what dividing line exists.

I totally agree with you. I'm a failed mathematician and an unsuccessful linguist :) But I think the fair reply to the OP is that there is no proof that linguistics as a scientific field has influenced computer languages anywhere close to the way mathematics has.

I'm re-reading all the replies and yes, there is plenty of proof, as was referenced in the exchange. I was just stuck in my own vision/experience of linguistics.

My understanding of the OP's question was more in the field of programming language design (as Ben seemed to hint), where an influence of linguistic research would have contributed to producing languages closer, in expressiveness and structure, to native languages. So I was considering only a very narrow aspect of both computing and linguistic research.

Thank you Dian and Yucer for the really interesting comments.

 
 
 

Naming things:

  • class, object, variable - nouns
  • function, method - verbs
  • interface - adjectives
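
A small hypothetical sketch of that convention (Python has no interface keyword, so an abstract base class with an adjective name stands in; all names are made up):

    from abc import ABC, abstractmethod

    # Hypothetical sketch of the noun / verb / adjective convention.
    class Printable(ABC):            # interface-like role: an adjective
        @abstractmethod
        def print_summary(self):     # method: a verb
            ...

    class Invoice(Printable):        # class: a noun
        def __init__(self, total):
            self.total = total       # variable: a noun

        def print_summary(self):
            print(f"Invoice total: {self.total}")

    Invoice(99).print_summary()      # Invoice total: 99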