DEV Community

Discussion on: Perl 7: A Modest Proposal

 
Dan Book • Edited

I don't see how use utf8 is analogous to auto-encoding on STDOUT; the counterpart to that is, of course, auto-decoding from STDIN. use utf8 is instead a lexical declaration of how the source code is to be interpreted.
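To illustrate the lexical scoping, here is a minimal sketch. It assumes the file is saved as UTF-8; the variable names are only for illustration. The same literal is read as raw bytes outside the pragma's scope and as characters inside it.

```perl
use strict;
use warnings;

# Assumption: this file is saved as UTF-8.

# No "use utf8" in scope: the literal is the raw UTF-8 bytes from the file.
my $bytes = "épée";

my $chars;
{
    use utf8;            # lexically scoped: affects only this block
    $chars = "épée";     # the literal is decoded into characters
}

print length($bytes), "\n";   # 6 (two 2-byte sequences plus two ASCII bytes)
print length($chars), "\n";   # 4
```

Nothing about STDIN or STDOUT changes here; only the interpretation of the source text inside the block does.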

The problem with -CS and any other application of layers to STDIN/STDOUT/STDERR is that the handles and any layers applied to them are global. So, for example, an encoding layer on STDERR will cause Mojo::Log's already-encoded output to STDERR to be double-encoded. (This experiment was attempted in Perl 5.8.0 and failed miserably.)
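A rough sketch of the double-encoding effect, under stated assumptions: an in-memory handle with an :encoding(UTF-8) layer stands in for a globally layered STDERR, and a plain Encode::encode call stands in for any library that encodes its own output before printing. This is not how Mojo::Log is implemented, only an illustration of the interaction.

```perl
use strict;
use warnings;
use Encode qw(encode);

# "épée" as 4 characters, written with escapes so the source encoding is irrelevant.
my $text = "\x{e9}p\x{e9}e";

# The library encodes its own output: 6 UTF-8 bytes.
my $encoded = encode('UTF-8', $text);

# Stand-in for a handle that had an encoding layer applied globally.
my $captured = '';
open(my $fh, '>:encoding(UTF-8)', \$captured) or die "open failed: $!";
print {$fh} $encoded;    # the layer encodes the already-encoded bytes again
close($fh);

printf "encoded once: %d bytes\n", length($encoded);    # 6
printf "after layer:  %d bytes\n", length($captured);   # 10
```

Each of the four high bytes is treated by the layer as a Latin-1 character and expanded to two bytes, which is the mojibake a reader would see on the terminal.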

Felipe Gasper

If there were a variant of use utf8 that didn’t auto-decode strings in the source, I’d be much less concerned. But the issue I see with defaulting use utf8 to on is that it would break any code like this:

perl -e'print "épée"'

In fact, it’ll even break things like this:

use Encode qw(decode_utf8);
my $text = decode_utf8("épée");  # expects the literal to be undecoded UTF-8 bytes
_send_to_dbus($text);

Ostensibly the goal of Perl 7 would be to define a set of defaults that only break “undesirable” practices. Changing the value of hard-coded strings in the source code seems likely to break a lot of things and thus deter people from using the new set of defaults.
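To make the quiet change in behavior concrete, here is a small sketch. It assumes the file is saved as UTF-8; printed_bytes is a hypothetical helper that captures what print would emit on a handle with no layers. It compares the same literal with and without use utf8 in scope.

```perl
use strict;
use warnings;

# Assumption: this file is saved as UTF-8.

# Hypothetical helper: hex dump of the bytes print() emits on a layer-free handle.
sub printed_bytes {
    my ($string) = @_;
    my $buffer = '';
    open(my $fh, '>', \$buffer) or die "open failed: $!";
    print {$fh} $string;
    close($fh);
    return join ' ', map { sprintf '%02x', ord } split //, $buffer;
}

my $without = printed_bytes("épée");
my $with    = do { use utf8; printed_bytes("épée") };

print "without use utf8: $without\n";   # c3 a9 70 c3 a9 65 (valid UTF-8)
print "with use utf8:    $with\n";      # e9 70 e9 65 (Latin-1 bytes)
```

No warning is raised in either case, because every character fits in a single byte; the output simply stops being valid UTF-8 once the literal is decoded.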

Dan Book

"A variant of use utf8 that didn't auto-decode strings in the source" would be a no-op - that is the only thing use utf8 does.

I appreciate your opinion, though I believe it would be more helpful than harmful to new code. use v7 is, of course, not meant to be applied blindly to existing code; as proposed, it will also break any code defining subroutine prototypes, for example.

Felipe Gasper

Prototypes have been “gently discouraged” for some time, though, AFAIK. More so, I think, than writing new Perl without use utf8.

use utf8 seems the most disruptive of the changes you propose, disruptive insofar as developers themselves would need to exercise especial care when writing new code or porting existing code. use v7 defined with use utf8 would be problematic where I work, for example, where strings are understood by default to be undecoded/binary/encoded. Whereas enabling strict/warnings/signatures will generate “loud”, easily fixed breakages, breakages from auto-decoding of strings in source seem likely to be subtler.

Anyhow … the appreciation of opinions is mutual. :) We’ll see what comes. Thanks!