My previous two blog posts (Perl 7: A Risk-Benefit Analysis and Perl 7 By Default) explored the reasons that a Perl 7 with incompatible interpreter defaults would be a mistake. Subsequently, Perl experienced a crisis of governance authority as several core developers also expressed this view. So I will not be further discussing the idea of changing defaults in Perl major versions. But, as I had stated in conclusion:
I believe making good use of a new major version is extremely important to portraying the continued and forward development of Perl to the wider programming community. A major version with major features can be a significant boon to jumpstart the stagnating perception of Perl and bring it in line with the reality of its development.
So then, what should we do? I have some suggestions.
Stable signatures
The widely lauded signatures feature is still experimental, in order to facilitate experimentation with several further important features that are needed for it to be considered feature-complete. However, at this point the basic design is well tested and stabilized, and has been unchanged for the requisite two stable releases. I propose that in Perl 7, the signatures feature be declared stable as-is and added to the :7.0 feature bundle, and that these further additions be developed as a separate initiative. The new additions could trigger distinct experimental warnings until stabilized, or be added under one or more new experimental features as appropriate. (Stabilization of the signatures feature has now been proposed by Paul Evans.)
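As a rough illustration of the basic design that would be stabilized, here is a minimal sketch of signatures as they can be enabled today. The subroutine names greet and total are made up for this example, and the no warnings line is only needed while the feature is still marked experimental.

```perl
use strict;
use warnings;
use feature 'signatures';
no warnings 'experimental::signatures';   # only needed while the feature is experimental

# Hypothetical example subs; the names are not from any real module.

# One mandatory positional parameter plus an optional one with a default.
sub greet ($name, $greeting = 'Hello') {
    return "$greeting, $name!";
}

# A slurpy array collects any remaining arguments.
sub total ($first, @rest) {
    my $sum = $first;
    $sum += $_ for @rest;
    return $sum;
}

print greet('Perl'), "\n";
print greet('Perl 7', 'Long live'), "\n";
print total(1, 2, 3), "\n";

# Arity is checked when the sub is called: too few arguments throws an exception.
my $ok = eval { greet(); 1 };
print "greet() with no arguments died\n" unless $ok;
```

Under the proposal, the first four lines would collapse into a single use v7.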
Remove misfeatures
The indirect, multidimensional, and bareword_filehandles features being added in Perl 5.34 (well, indirect is already in Perl 5.32) are "negative" features: the behavior has existed in Perl for quite a while, but the presence of the named features allows disabling it lexically. These misfeatures are not considered best practice and lead to confusing issues, and the ability to disable them, or at least complain loudly upon encountering their use, has been available from CPAN modules for some time; a modern feature bundle should discourage their use. I propose removing these three features from the :7.0 feature bundle.
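To make "negative feature" concrete, here is a small sketch using only the indirect feature, so it assumes Perl 5.32 or newer. The Counter class is made up for this example. The same code is compiled twice from a string: once with the default bundle, and once with the feature disabled lexically.

```perl
use v5.32;      # strict plus the :5.32 bundle, which still includes 'indirect'
use warnings;

# Hypothetical class, only here to have a constructor to call.
package Counter {
    sub new { my ($class) = @_; return bless { count => 0 }, $class }
}

# With the feature on (the default), indirect object syntax compiles.
my $parsed_by_default = eval q{ my $c = new Counter; 1 } ? 1 : 0;

# With the feature disabled lexically, the same code fails to compile.
my $parsed_when_disabled
    = eval q{ no feature 'indirect'; my $c = new Counter; 1 } ? 1 : 0;

say "indirect enabled:  ", $parsed_by_default    ? "compiles" : "syntax error";
say "indirect disabled: ", $parsed_when_disabled ? "compiles" : "syntax error";
```

Leaving the three features out of the :7.0 bundle would give every use v7 scope the second behavior without the explicit no feature line.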
Apply warnings
Since v5.12 or 5.012, the use VERSION keyword has enabled strict alongside the appropriate feature bundle. I propose that Perl 7 finally have use VERSION enable both the recommended strict and warnings pragmas when a version of 7 or higher is requested.
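The current gap can be seen in a short sketch: use v5.12 turns on strict, but warnings still have to be requested separately. The variable names are made up for this example, and warnings are collected through a $SIG{__WARN__} handler so they can be counted.

```perl
use v5.12;   # enables strict and the :5.12 feature bundle, but not warnings

# Collect warnings instead of printing them, so they can be counted.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, @_ };

# strict is already on: assigning to an undeclared variable fails to compile.
my $strict_ok    = eval q{ $undeclared = 1; 1 };
my $strict_error = $@;

{
    my $undef;
    my $quiet = "value: " . $undef;   # no warning: warnings are off in this scope
}
{
    use warnings;
    my $undef;
    my $loud = "value: " . $undef;    # warns about an uninitialized value
}

say "strict rejected the code" unless $strict_ok;
say "warnings seen: ", scalar @warnings;
```

Under the proposal, a use v7 at the top would make both blocks warn.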
Apply utf8
The utf8 pragma declares that the current source file is encoded in UTF-8 rather than the native single-byte encoding (usually ISO-8859-1). (This is unrelated to functions in that namespace such as utf8::decode, as well as other UTF-8 related behavior like that provided by the open pragma.) Zefram previously proposed that this pragma be gradually made default and thus a no-op, to better match the expectations of modern programming. Along with and in anticipation of this step, I propose that Perl 7 have use v7 also do the equivalent of use utf8.
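What the pragma changes can be seen by writing the same literal in two lexical scopes. This is a minimal sketch that assumes the file itself is saved as UTF-8; the variable names are made up for the example.

```perl
use strict;
use warnings;

# ASSUMPTION: this source file is saved as UTF-8.
my $bytes = do { no utf8;  "☃" };   # literal kept as its three UTF-8 bytes
my $chars = do { use utf8; "☃" };   # literal decoded to one character, U+2603

# %vX prints the ordinal of each element of the string in hex, joined by dots.
printf "without utf8: length %d, %vX\n", length($bytes), $bytes;
printf "with utf8:    length %d, %vX\n", length($chars), $chars;
```

Only the interpretation of the source text changes. Nothing here touches how filehandles encode or decode data.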
All Together
- Declare 'signatures' feature stable as is
- Add 'signatures' feature to :7.0 feature bundle
- Remove 'indirect', 'multidimensional', and 'bareword_filehandles' negative features from :7.0 feature bundle
- Apply effect of 'use warnings' with 'use v7' or higher
- Apply effect of 'use utf8' with 'use v7' or higher
Instead of this (note: use v5.32 already includes strict):
use v5.32;
use warnings;
use utf8;
use experimental 'signatures';
no feature qw(indirect multidimensional bareword_filehandles);
New, modern code will simply be able to write:
use v7;
And instead of this:
$ perl -Mstrict -Mwarnings -Mutf8 -Mexperimental=signatures \
    -M-feature=indirect,multidimensional,bareword_filehandles \
    -E'sub dumphex ($str) { printf "%vX", $str } dumphex "☃"'
Modern oneliners will be able to get the same effect from:
$ perl -M7 -E'sub dumphex ($str) { printf "%vX", $str } dumphex "☃"'
Looking to the Future
We must consider not only what Perl 7 should be, but what Perl should be beyond this milestone. The above proposal for what Perl 7 and use v7 could entail leads naturally to a flexible, powerful, and considerate method of promoting stable features and a modern programming environment.
Past feature bundles have been changed rather rarely but at seemingly random versions which are difficult to recall even for the most attentive Perl hacker. Since the introduction of feature bundles in Perl 5.10, they have only changed in 5.12, 5.16, 5.24, and 5.28. Thus my final proposal is that we no longer change feature bundles in arbitrary releases, but only in (true) major versions, which provide significantly more memorable junction points and opportunity for advertisement of these important features.
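To see which bundles a given perl knows about, the table inside feature.pm can be inspected. This is only a sketch: %feature::feature_bundle is an internal detail of the module rather than a documented interface, so its presence, its keys, and its layout are assumptions that may not hold on every release. The helper name bundle_features is made up for this example.

```perl
use strict;
use warnings;
use feature ();   # load feature.pm without enabling any features

# ASSUMPTION: %feature::feature_bundle is internal to feature.pm and maps a
# bundle name to an array reference of feature names. It is not a documented
# interface and may differ between Perl releases.
sub bundle_features {
    my ($name) = @_;
    return sort @{ $feature::feature_bundle{$name} // [] };
}

# List the plain "major.minor" bundles, ordered by minor version number.
my @names = sort { (split /\./, $a)[1] <=> (split /\./, $b)[1] }
            grep { /^\d+\.\d+$/ } keys %feature::feature_bundle;

for my $name (@names) {
    printf ":%-5s %s\n", $name, join ' ', bundle_features($name);
}
```

Comparing adjacent lines of the output shows where a bundle gained or lost a feature.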
There are several features currently in the experimental, design, or CPAN prototyping phase that promise to further improve the modern Perl experience, many of which Paul Evans discussed in his FOSDEM 2021 talk:
Any of these features which are not stable and ready for inclusion in a v7 feature bundle, as well as any that are built in the meantime, can be revisited for v8. Each successive major version bundle can continue to promote the best practices of Perl features and help users evolve their usage of modern Perl with a clear, simple declaration. And most importantly, each major version including 7 can be scheduled once a sufficiently exciting new modern feature set is stable and ready to use.
Perl 7 is dead. Long live Perl 7.
Top comments (17)
utf8, of course, does more than merely “declare” to Perl that the source is UTF-8; it also makes Perl decode() strings. Thus, simple one-liners like this: … will print mojibake, thusly:
If utf8 is to be on by default, should we not preserve the functionality of such simple one-liners? That would entail making STDOUT automatically encode to UTF-8. And then if STDOUT is UTF-8, should STDIN be?
My experience, FWIW, has been that utf8 makes sense only if you care about the strings as text. I, at least, have only been in that scenario a few times. Generally when multi-byte UTF-8 characters come across code I’ve written, I don’t care about the characters themselves; I’m just doing I/O.
If a user does use v7 or -M7 under this proposal, part of what they have opted into is for their source code to be decoded to characters. Changing the behavior of the global STDIN and STDOUT handles in a reasonable way is unfortunately impossible, but you can already do that with -CS if you accept the consequences.
My oneliners often use ojo which already enables the utf8 pragma and it's operated as expected. Data that flows from STDIN to STDOUT would be unchanged by this, though you already needed to use -CS or appropriate decoding and encoding if you want to operate on it as text. Unfortunately there is no way around learning how and when character encoding occurs if you want to interact with text as bytes.
The proposal here, though, defines what someone opts into.
All the other pieces of your proposal seem, at least from my own vantage point, to be “easy wins”. Auto-decode without an auto-encode, though, seems ripe for subtle misuse. If Perl could somehow mark the PVs as decoded, and always trigger a warning or error on output, I’d be less concerned.
Forgive my ignorance, but why would enabling -CS by default be any less feasible than use utf8 by default?
I don't see how use utf8 is analogous to auto-encoding on STDOUT - the opposite is of course auto-decoding from STDIN. use utf8 is instead a lexical declaration of how the source code shall be interpreted.
The problem with -CS and any other application of layers to STDIN/STDOUT/STDERR is that the handles and any layers applied to them are global. So for example, it will cause Mojo::Log's encoded output to STDERR to be double-encoded. (This experiment was attempted in Perl 5.8.0 and failed miserably.)
If there were a variant of use utf8 that didn’t auto-decode strings in the source, I’d be much less concerned. But the issue I see with defaulting use utf8 to on is that it would break any code like this:
In fact, it’ll even break things like this:
Ostensibly the goal of Perl 7 would be to define a set of defaults that only break “undesirable” practices. Changing the value of hard-coded strings in the source code seems likely to break a lot of things and thus deter people from using the new set of defaults.
"A variant of use utf8 that didn't auto-decode strings in the source" would be a no-op - that is the only thing use utf8 does.
I appreciate your opinion though I believe it would be more helpful to new code than harmful. The purpose of use v7 is of course not to blindly apply to existing code - as proposed, it will also break any code defining subroutine prototypes, for example.
Prototypes have been “gently discouraged” for some time, though, AFAIK. More so, I think, than writing new Perl without use utf8.
use utf8 seems the most disruptive of the changes you propose: disruptive insofar as developers themselves would need to exercise especial care when writing new code or porting existing code.
use v7 defined with use utf8 would be problematic where I work, for example, where strings are understood by default to be undecoded/binary/encoded. Whereas enabling strict/warnings/signatures will generate “loud”, easily-fixed breakages, breakages from auto-decode of strings in source seem likely to be subtler.
Anyhow … the appreciation of opinions is mutual. :) We’ll see what comes. Thanks!
Agreed, those are very sensible goals.
Also, does use utf8 slow Perl down by storing strings internally as upgraded?
To get the length() of an upgraded string, Perl has to parse the individual characters. But the length() of a downgraded string is just its SvCUR.
Operating on Unicode is of course always slower. But only non-ASCII strings are stored upgraded by use utf8. So the performance impact is necessary to get the correct length of such strings. (It's also cached in MG_LEN after the first access.)
add async+await and we are in business :)
For one-liners, it would be good if -E automatically enabled all the "positive" v7 features by default. -e would still be backwards-compatible.
-E already has the behavior of enabling the feature bundle of the current Perl version. It does not enable strict, and so I would suggest it should not enable warnings or utf8 either (as mentioned in the post, -M7 can be used to apply whatever use v7 may end up entailing).
Perl7 is a good idea. I use List::Util or List::MoreUtils all the time and was thinking perl7 could have at least some of those as built-ins. sum, min, max, uniq, zip and then some. Maybe even File::Slurp, since handling file content all at once, for most files, has become a lot cheaper since perl started.
Sounds great to me!
Those are great suggestions. I hope they are all adopted for Perl 7. Thanks for the write up.
Great article, as usual.