DEV Community

Discussion on: Should a modern programming language assume a byte is 8-bits in size?

 
Vinay Pai

I can't think of the last time I had a bug caused by integer overflow. Does that happen to you a lot?

The cost of using arbitrary precision everywhere is way more than "a few cycles". You're adding overhead all over the place. Incrementing a number goes from a simple machine instruction that can be pipelined to a loop which will likely result in a pipeline stall.
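To spell out what that loop looks like, here's a rough sketch (Rust, purely illustrative; real big-integer libraries differ in the details):

```rust
// Rough sketch only: what an "increment" turns into once the number is a
// sequence of limbs instead of a single machine word. Limbs are assumed to be
// stored least-significant first.
fn bignum_increment(limbs: &mut Vec<u64>) {
    for limb in limbs.iter_mut() {
        let (sum, carried) = limb.overflowing_add(1);
        *limb = sum;
        if !carried {
            return; // no carry left to propagate
        }
    }
    // Carried out of every limb: the number has to grow.
    limbs.push(1);
}

// The fixed-width version compiles down to a single add instruction.
fn native_increment(x: &mut u64) {
    *x = x.wrapping_add(1);
}
```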

You can't just allocate an array of integers because you don't know how much memory that will need. Increment the nth element of an array? That is potentially an O(n) operation now, because the element might grow and force everything after it to shift. Or your array of integers could actually be an array of pointers to the structs holding your arbitrary precision integers. That DRASTICALLY slows things down on modern processors: your memory references have worse locality, so your all-important L1 cache hit ratio goes down the tubes.
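And a sketch of the layout difference, with a Box standing in for a heap-allocated arbitrary-precision value:

```rust
// Rough illustration of the two layouts (not a benchmark). Box<u64> is only a
// stand-in for a heap-allocated arbitrary-precision integer.
fn increment_flat(values: &mut [u64], n: usize) {
    values[n] += 1; // contiguous storage: neighbouring elements share cache lines
}

fn increment_boxed(values: &mut [Box<u64>], n: usize) {
    *values[n] += 1; // extra pointer chase to wherever the allocator put the value
}
```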

It's like flying airliners at 10,000 feet instead of 30,000 feet to avoid the risk of cabin depressurization.

Thread Thread
 
Erebos Manannán

When you say "drastically", you mean "has literally no perceptible impact at all in most cases". The O(n) timing and so on mean next to nothing in the general case. There are places where speed matters, and those places are getting increasingly rare.

I follow the security scene a fair bit and especially there I keep reading about random pieces of software constantly running into integer overflow/underflow issues.

They ARE a cause of bugs when, e.g., a developer thinks "well, I'm just asking the users to input an age, and a normal human lives to be at most 100 years old, so I'll just use int8", and then the user neither knows nor cares what constraints the programmer had in mind and tries to use the same application to catalog the ages of antique items, species, or planets.
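To make that concrete, a minimal sketch (Rust, with made-up names) of the mismatch:

```rust
fn main() {
    let antique_age_years: i64 = 4_500;

    // The silent version of the bug: `as` just truncates the bits, so the
    // stored "age" comes out as a wrong, negative number.
    let stored = antique_age_years as i8;
    println!("stored age: {}", stored);

    // A checked conversion at least surfaces the constraint instead of wrapping.
    match i8::try_from(antique_age_years) {
        Ok(age) => println!("age fits: {}", age),
        Err(_) => println!("{} years does not fit in an i8", antique_age_years),
    }
}
```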

"Premature optimization is the root of all evil" is a fitting quote for this discussion. Optimize where you need it, don't worry and micro-optimize your CPU cycles everywhere because some school teacher taught you about O(...) notation. YOUR time is often much more valuable (i.e. you getting things done, without nasty surprises that can lead to unhappy users, security issues, or anything else) than the CPU cycles.

How often do you care about, or even look at, what your L1 cache hit ratio is when you write a typical desktop or mobile app, or any web frontend/backend? Much less often than you care about having code that just works regardless of what size of number the user (malicious or not) decided to give you.

And AGAIN, when you DO need to care, the option can be there to be explicit.

Thread Thread
 
Vinay Pai

People mindlessly repeating mantras like "premature optimization is the root of all evil" is the root of all evil.

Thread Thread
 
Erebos Manannán

I think my comment had quite a bit more content to it than that.

cvedetails.com/google-search-resul... (About 16,500 results)

cvedetails.com/google-search-resul... (About 3,150 results)

And these are just the reported security issues, not counting the ordinary bugs caused by choosing the wrong integer size.

Here's a new quote, it's quoting me saying it just here: "People quoting O(...) notation and talking about L1 cache as if any of it mattered at all for most cases are the root of all evil" ;)

Thread Thread
 
Vinay Pai

Okay, let's say you replaced them with arbitrary precision arithmetic. How many new bugs would then be caused by malicious input triggering huge memory allocations and blowing up the server?

Thread Thread
 
Erebos Manannán

Quick estimate: probably fewer. For one thing, it'd be easier to do an if (length > MAX_LENGTH)-type check.

Also, if you use user input to determine how much memory you allocate, you're probably doing something wrong anyway, regardless of what kind of arithmetic you're doing. Take a file upload: do you rely on the client telling you "I'm sending you a file that is 200 kB in size, and here it comes" and trust that, or do you just take in an arbitrary file stream and, if it gets too big, say "okay, enough" at some point and disconnect?
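A minimal sketch of that "okay, enough" approach (Rust; the 10 MiB cap and names are arbitrary examples, not recommendations):

```rust
use std::io::{self, Read};

// An assumed limit for illustration only.
const MAX_UPLOAD_BYTES: u64 = 10 * 1024 * 1024;

fn read_upload_capped<R: Read>(client: R) -> io::Result<Vec<u8>> {
    // Ignore whatever size the client claimed: read at most the cap plus one
    // byte, then reject anything that turned out to be over the limit.
    let mut body = Vec::new();
    client.take(MAX_UPLOAD_BYTES + 1).read_to_end(&mut body)?;
    if body.len() as u64 > MAX_UPLOAD_BYTES {
        return Err(io::Error::new(io::ErrorKind::InvalidData, "upload too large"));
    }
    Ok(body)
}
```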

Anyway I tire of this mindless banter. I've made my point.

Thread Thread
 
edA‑qa mort‑ora‑y

A few notes, related to Leaf, for this discussion:

  • I intend to do over/underflow checks by default (unless they're turned off for optimization), so an overflow will result in an error.
  • I will provide logical ranges for values, like integer range(0,1000), so you can give real-world limits to numbers and let an appropriate type be picked (a rough sketch of these first two points follows this list).
  • Arbitrary precision is extremely costly compared to native precision. A fixed but very high precision is not as costly, but it doesn't solve anything. On that note, you can do integer 1024bit in Leaf if you want.
  • Leaf constants are arbitrary rationals and high precision floating points during compilation. Conversions that lose precision (like float -> integer) are also disallowed. This helps in several situations.
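Not Leaf syntax, but a rough Rust approximation of the first two points above, with made-up names:

```rust
// Rough analogue of "overflow results in an error": checked arithmetic that
// refuses to wrap silently.
fn add_checked(a: i32, b: i32) -> Result<i32, String> {
    a.checked_add(b).ok_or_else(|| "integer overflow".to_string())
}

// Rough analogue of "integer range(0,1000)": a value that enforces its
// real-world limits when constructed.
struct Ranged(i32);

impl Ranged {
    fn new(value: i32) -> Result<Ranged, String> {
        if (0..=1000).contains(&value) {
            Ok(Ranged(value))
        } else {
            Err(format!("{} is outside range(0, 1000)", value))
        }
    }
}

fn main() {
    assert!(add_checked(i32::MAX, 1).is_err()); // overflow caught, not wrapped
    assert!(Ranged::new(1_500).is_err());       // out-of-range rejected at the boundary
}
```
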
Thread Thread
 
Vinay Pai

So you pointed to a bunch of bugs caused by a lack of range checks. Your solution to avoid creating another bug is to... add a range check. Brilliant! You have indeed made your point.