Why don't programming languages have 'String Subtraction' and 'String Division'


Hey, I'm back, wonderful lurkers and/or posters of Dev!
I have a really interesting question (as stated in the title): why can't you use "String Subtraction" and "String Division" in most programming languages? (I say 'most' because I know there are some languages which allow the former.)
Why can't we, say, do this:

"Hello, World!" - ", World"
// "Hello!"

or what about

"Hello!" / 3
// "He"

?
Thanks,
and Cheers!


There are a couple of practical answers on here already, but I think the theory involved in this question is also very interesting. Operations like subtraction and division have specific definitions for how a data type must behave when they are applied to it. The field of Category Theory is one place we can look for those definitions.

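A data type that can be subtracted belongs to the category of Rings, and rings must satisfy the additive inverse law a - a = (-a) + a = zero. To me, 'hi hi hi' - 'hi' seems intuitively like it should return '  ' (just the leftover spaces), i.e. remove all the occurrences, but that breaks the law. For example, ('hi' + 'hi') - 'hi' would then give '' rather than 'hi', so subtraction no longer undoes addition.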
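To make the broken law concrete, here is a minimal Python sketch. `sub_all` is a hypothetical helper, not something any language provides; it assumes "subtraction" means removing every occurrence of the right-hand string.

```python
def sub_all(a: str, b: str) -> str:
    """Hypothetical string 'subtraction': remove every occurrence of b from a."""
    return a.replace(b, "")

# For numbers, subtraction undoes addition: (a + b) - b == a.
assert (5 + 3) - 3 == 5

# For strings under this assumed definition, it does not.
print(repr(sub_all("hi" + "hi", "hi")))   # ''   (we would need 'hi')
print(repr(sub_all("hi hi hi", "hi")))    # '  ' (two leftover spaces)
```

Other definitions (remove only the first match, or only a trailing match) fail in other ways, which is roughly why no single choice feels natural.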

A data type that can be divided belongs to the more specific category of Euclidean rings. It turns out that there is just no way to define subtraction and division for strings that both follows these rules and is easy for people to pick up in a given language. The theory is a little out there and abstract, but it has major implications for how easy a language is to learn and teach.

What I think is interesting is that various languages have chosen to overload + to perform what are actually two totally different theoretical operations depending on the context. When you add two numbers you are performing addition, which is defined for a category called Semirings (semi because addition but no subtraction). When you use it to concatenate strings you are performing the append operation as defined for the Semigroup category.
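A small Python sketch of the difference between the two meanings of +. The helper names `is_associative` and `is_commutative` are made up for illustration, and each call checks a single example rather than proving a law.

```python
import operator

def is_associative(op, x, y, z) -> bool:
    """Check (x op y) op z == x op (y op z) for one set of values."""
    return op(op(x, y), z) == op(x, op(y, z))

def is_commutative(op, x, y) -> bool:
    """Check x op y == y op x for one pair of values."""
    return op(x, y) == op(y, x)

# Both uses of + are associative, which is all a semigroup asks for.
print(is_associative(operator.add, 1, 2, 3))        # True
print(is_associative(operator.add, "a", "b", "c"))  # True

# Only numeric addition is commutative; appending strings is not.
print(is_commutative(operator.add, 1, 2))           # True
print(is_commutative(operator.add, "a", "b"))       # False
```

The same symbol therefore obeys different laws depending on the operand type, which is the inconsistency a separate append operator avoids.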

This goes counter to what I said earlier about consistency, teaching, and learning, but it's written into history at this point and we just live with it 😄. My favorite language, Elm, uses + for addition and ++ for appending.

So, to sum up, + on strings isn't technically addition, but really a choice language designers made to express the separate concept of appending. Mathematical operations like addition, subtraction, and division aren't definable for strings in a way that makes sense broadly to the average developer with respect to both other languages and other types within a given language. I hope this was interesting!

I've included some links to the definitions for the categories I mentioned in PureScript. In PureScript all of these rules are explicitly defined, and I have found that to be valuable in learning these concepts.

github.com/purescript/purescript-p...
github.com/purescript/purescript-p...
github.com/purescript/purescript-p...
github.com/purescript/purescript-p...

 
 

The 'subtraction' thing is a sub-string removal, which, if it is going to be useful, is too complicated to express as a binary operator. Most languages provide some kind of function call to handle this.

The 'division' thing is just a type of slicing, which most languages have in some form as well.

Both can also be easily implemented in any language that allows you to get individual characters at arbitrary indexes in the string (which almost all modern languages can do).
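As a rough illustration of how little code this takes, here is a Python sketch. `OpStr` is a hypothetical class, and the operator meanings are assumptions chosen to match the examples in the question: `-` removes the first occurrence of a substring, and `/ n` keeps the first `len // n` characters.

```python
class OpStr(str):
    """Hypothetical str subclass; the operator meanings are assumptions."""

    def __sub__(self, other: str) -> "OpStr":
        # "Subtraction": remove only the first occurrence of other.
        return OpStr(self.replace(other, "", 1))

    def __truediv__(self, n: int) -> "OpStr":
        # "Division": keep the first len(self) // n characters.
        return OpStr(self[: len(self) // n])

print(OpStr("Hello, World!") - ", World")  # Hello!
print(OpStr("Hello!") / 3)                 # He
```

Every choice here (first match versus all matches, which part of the string to keep, how to round) is arbitrary, which is a practical argument for named functions over operators.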

 
 
"Hello, World!".replace(", World", "")
# "Hello!"

"Hello!"[:len("Hello!") // 3]
# "He"

Because we don't need them.

 
PDS OWNER CALIN (Calin Baenen)
I am a 13-year-old (as of Oct 30, 2019) developer who makes projects in languages like Java, HTML, Python, JS, CSS, and C, and I am working on learning C++ and C#.