John Napiorkowski

Posted on Aug 25, 2023

Benchmarking Perl Core Class in v5.38

#perl

UPDATE

Gist to my test script so you can tell me what I did wrong: https://gist.github.com/jjn1056/6a34517ec1184d0fd7190b090eec1414

Introduction

One of the biggest upsides of having improved support for objects and classes in core Perl, as opposed to picking from a grab bag of options on CPAN, is that since this code is written at the C level it should be faster than options such as Moose, which is written in Perl. There are of course big upsides to writing an object system in Perl, mostly that more programmers can contribute to it, which results in something more reflective of broad community needs. Nevertheless, if we could vastly reduce the cost of creating classes in Perl, that could be a huge win for certain types of projects. So I created a very simple test case to compare core class with two of the most popular object systems on CPAN: Moose and Moo.

The Test Case

Since core class has very few features compared to Moo/se, I chose a very simple test: the time to create an object of a simple class with three attributes (or 'fields', as they are named in core class) that are initialized via the 'new' constructor, and then to read each of those fields. Here's what that looks like in Moose:

package MyClass::Moose;
use Moose;

has 'attribute1' => (is => 'ro');
has 'attribute2' => (is => 'ro');
has 'attribute3' => (is => 'ro');

MyClass::Moose->meta->make_immutable;

And in Moo:

package MyClass::Moo;
use Moo;

has 'attribute1' => (is => 'ro');
has 'attribute2' => (is => 'ro');
has 'attribute3' => (is => 'ro');

And finally in core class:

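use v5.38;
use feature 'class';                # core class is still experimental in 5.38
no warnings 'experimental::class';

class MyClass::CoreClass {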
  field $attribute1 :param;
  field $attribute2 :param;
  field $attribute3 :param;

  method attribute1() { $attribute1 }
  method attribute2() { $attribute2 }
  method attribute3() { $attribute3 }
}

If anyone is more learned in core class and would like to suggest a better approach, please let me know. Core class has to be more verbose than Moo/se since it doesn't support generating accessors for your fields (AFAIK).

And here's the initial test case:

sub test_case1 {
  my $obj = MyClass::Moose->new(
    attribute1 => 'hello',
    attribute2 => 42, 
    attribute3 => 1);
  my $attribute1 = $obj->attribute1;
  my $attribute2 = $obj->attribute2;
  my $attribute3 = $obj->attribute3;
}

sub test_case2 {
  my $obj = MyClass::Moo->new(
    attribute1 => 'hello',
    attribute2 => 42, 
    attribute3 => 1);
  my $attribute1 = $obj->attribute1;
  my $attribute2 = $obj->attribute2;
  my $attribute3 = $obj->attribute3;
}

sub test_case3 {
  my $obj = MyClass::CoreClass->new(
    attribute1 => 'hello',
    attribute2 => 42, 
    attribute3 => 1);
  my $attribute1 = $obj->attribute1;
  my $attribute2 = $obj->attribute2;
  my $attribute3 = $obj->attribute3;
}

I used the standard Perl Benchmark class:

use Benchmark qw(:all);
timethis(2000000, \&test_case1, "Moose: Create and access");
timethis(2000000, \&test_case2, "Moo: Create and access");
timethis(2000000, \&test_case3, "Core: Create and access");

Here are the results:

Moose: Create and access:  6 wallclock secs ( 5.55 usr +  0.02 sys =  5.57 CPU) @ 359066.43/s (n=2000000)
Moo: Create and access:  4 wallclock secs ( 3.54 usr +  0.00 sys =  3.54 CPU) @ 564971.75/s (n=2000000)
Core: Create and access:  4 wallclock secs ( 4.07 usr +  0.00 sys =  4.07 CPU) @ 491400.49/s (n=2000000)

(I did this on my Intel Mac running Perl 5.38.)

Ok so... I was not expecting core class to lose out to Moo. And honestly Moose, which is widely thought of as the fat option, is not actually that much worse. Over several runs, Moose is about 35% slower than Moo, and Moo is anywhere from 7-15% faster than core class. This is surprising to me. I would have expected core class, which has lower overall overhead since it has far fewer features than Moo/se AND is written in C, to have utterly blown the doors off its competition. So I stepped back and decided to eliminate the accessor method calls from the benchmarks and just test object creation. So I wrote three new tests:
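For readers who want to check percentages like these against the output, here is a small sketch that derives them from the rates printed by Benchmark. The helper names `percent_slower` and `percent_faster` are made up for this example and are not part of Benchmark, and the figures only reflect the single run shown above, so they will drift from run to run.

```perl
use strict;
use warnings;

# Iterations per second, copied from the single run shown above.
my %rate = (
    Moose => 359066.43,
    Moo   => 564971.75,
    Core  => 491400.49,
);

# Hypothetical helper: how much lower is $slow's rate, relative to $fast's?
sub percent_slower {
    my ($slow, $fast) = @_;
    return 100 * (1 - $rate{$slow} / $rate{$fast});
}

# Hypothetical helper: how much higher is $fast's rate, relative to $slow's?
sub percent_faster {
    my ($fast, $slow) = @_;
    return 100 * ($rate{$fast} / $rate{$slow} - 1);
}

printf "Moose is %.1f%% slower than Moo\n", percent_slower('Moose', 'Moo');
printf "Moo is %.1f%% faster than Core\n",  percent_faster('Moo', 'Core');
```

Note that "X% slower" and "Y% faster" use different baselines, so the same pair of rates yields two different percentages. It is worth saying which one is meant when quoting a number.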

sub test_case4 {
  my $obj = MyClass::Moose->new(
    attribute1 => 'hello',
    attribute2 => 42, 
    attribute3 => 1);
}

sub test_case5 {
  my $obj = MyClass::Moo->new(
    attribute1 => 'hello',
    attribute2 => 42, 
    attribute3 => 1);
}

sub test_case6 {
  my $obj = MyClass::CoreClass->new(
    attribute1 => 'hello',
    attribute2 => 42, 
    attribute3 => 1);
}

and added these to the benchmarks:

timethis(2000000, \&test_case4, "Moose: Create object");
timethis(2000000, \&test_case5, "Moo: Create object");
timethis(2000000, \&test_case6, "Core: Create object");

And got this:

Moose: Create object:  5 wallclock secs ( 3.91 usr +  0.01 sys =  3.92 CPU) @ 510204.08/s (n=2000000)
Moo: Create object:  3 wallclock secs ( 2.71 usr +  0.00 sys =  2.71 CPU) @ 738007.38/s (n=2000000)
Core: Create object:  3 wallclock secs ( 2.72 usr +  0.00 sys =  2.72 CPU) @ 735294.12/s (n=2000000)

Again Moose is the loser by a solid 45% (the average of running this test script 10 times, with pauses in between to let my computer cool off). But again, surprisingly, Moo and core class are basically tied, even though Moo is basically pure Perl and runs on Perl versions as far back as 5.6 (!!!) while supporting more features such as roles and type constraints, and is widely compatible with many Perl object systems, including the plain old bless system which has been around since the previous century. And Moose honestly isn't that much worse considering the vast overhead of its meta object protocol. I was completely expecting the C/XS-based Perl core class to be an order of magnitude faster, or better, in these benches. It's basically a wash.

Conclusion: Core class isn't faster

So, honestly, I find this disappointing. We're being asked to give up a lot with core class and I was expecting to find some silver lining in terms of performance, especially as I was directly told one upside to using the new internal private data system was that it would be faster than blessed hashes. It's clearly not. Please feel free to offer additional tests or explanations of errors in my approach.

What about memory usage?

In theory core class, written in C and using an optimized storage instead of blessed hash refs, could improve memory usage. While not the most important metric for a scripting language, it's worth a look. I used Devel::Size for this. Again, feel free to correct me if I'm doing it wrong. Here are the examples:

use Devel::Size 'total_size';

{
  my $obj = MyClass::Moose->new(
    attribute1 => 'hello',
    attribute2 => 42, 
    attribute3 => 1);

  my $size = total_size($obj);
  print "Moose size: $size bytes\n";
}

{
  my $obj = MyClass::Moo->new(
    attribute1 => 'hello',
    attribute2 => 42, 
    attribute3 => 1);

  my $size = total_size($obj);
  print "Moo size: $size bytes\n";
}

{
  my $obj = MyClass::CoreClass->new(
    attribute1 => 'hello',
    attribute2 => 42, 
    attribute3 => 1);

  my $size = total_size($obj);
  print "Core size: $size bytes\n";
}

And the output:

Moose size: 420 bytes
Moo size: 420 bytes
Core size: 107 bytes

I'm actually surprised that the Moo size and Moose size are the same; I did expect Moose to be much fatter. It's possible Devel::Size is not finding the Moose meta class or something similar. But here we do have a solid win for core class. Roughly 4x less memory usage is totally meaningful and could have a major impact for today's workloads, which often run in virtualized containers where memory usage is important. I'll be interested in seeing how this changes when or if core class plays feature catch-up with Moose.
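One way to check whether the class name alone changes what Devel::Size reports is to bless two hashes with identical keys and values into different packages and compare them. This is only a sketch: the package names `Sketch::Foo` and `Sketch::Bar` are placeholders invented for the example, and the measurement is skipped if Devel::Size (a CPAN module) is not installed.

```perl
use strict;
use warnings;

# Devel::Size comes from CPAN; skip the measurement if it is missing.
my $have_devel_size = eval { require Devel::Size; 1 };

my %args = (attribute1 => 'hello', attribute2 => 42, attribute3 => 1);

# Placeholder packages: same data, different class names.
my $foo = bless { %args }, 'Sketch::Foo';
my $bar = bless { %args }, 'Sketch::Bar';

my ($foo_size, $bar_size);
if ($have_devel_size) {
    # total_size follows references, so it includes the hash and its values.
    $foo_size = Devel::Size::total_size($foo);
    $bar_size = Devel::Size::total_size($bar);
    print "Sketch::Foo size: $foo_size bytes\n";
    print "Sketch::Bar size: $bar_size bytes\n";
}
else {
    print "Devel::Size not installed, skipping\n";
}
```

If the two numbers match, the measurement is driven by the data reachable from the object reference rather than by the class it is blessed into. Per-class overhead such as a metaclass would then presumably have to be measured another way, for example by looking at the size of the whole process.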

What about Plain Old Bless?

It's not a fair comparison because if you want to party like it's 1999 and just roll something directly over bless you lose tons of features and safety. But for kicks let's take a look. Here's the class:

package MyClass::Bless;

sub new {
  my ($class, %args) = @_;  # named arguments become the object's hash
  return bless \%args, $class;
}

sub attribute1 { return shift->{attribute1} }
sub attribute2 { return shift->{attribute2} }
sub attribute3 { return shift->{attribute3} }

And the test results:

Bless: Create and access:  1 wallclock secs ( 0.46 usr +  0.00 sys =  0.46 CPU) @ 4347826.09/s (n=2000000)
Bless: Create object:  1 wallclock secs ( 0.48 usr +  0.00 sys =  0.48 CPU) @ 4166666.67/s (n=2000000)
Bless size: 120 bytes

So it looks like core class still wins on overall object size (bless is 11-12% larger), but the victory is much smaller than against Moo/se. And wow, look at the speed of object creation. Old school bless is 5-6x faster than both Moo and core class. So right now, when you need the lightest possible objects and are able to do without the handrails Moo/se offer, it's worth looking at old school bless. This could matter in the case where you are making zillions of objects for something. It's a good benchmark for core class to shoot at.
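For anyone who wants to reproduce the bless comparison without installing anything from CPAN, here is a minimal, self-contained sketch using only the core Benchmark module. The package name `Sketch::Bless`, the sub names, and the one-CPU-second run length are choices made for this example, and the absolute rates will differ from machine to machine.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

package Sketch::Bless {
    # Plain blessed-hash constructor: store the named arguments as-is.
    sub new {
        my ($class, %args) = @_;
        return bless \%args, $class;
    }
    sub attribute1 { return $_[0]{attribute1} }
    sub attribute2 { return $_[0]{attribute2} }
    sub attribute3 { return $_[0]{attribute3} }
}

# Object creation only.
sub create_only {
    my $obj = Sketch::Bless->new(
        attribute1 => 'hello', attribute2 => 42, attribute3 => 1);
    return;
}

# Object creation followed by one read of each attribute.
sub create_and_access {
    my $obj = Sketch::Bless->new(
        attribute1 => 'hello', attribute2 => 42, attribute3 => 1);
    my $attribute1 = $obj->attribute1;
    my $attribute2 = $obj->attribute2;
    my $attribute3 = $obj->attribute3;
    return;
}

# A negative count means "run each case for at least that many CPU seconds".
# cmpthese prints a comparison chart and returns its rows as an array ref.
my $rows = cmpthese(-1, {
    'Bless: create'            => \&create_only,
    'Bless: create and access' => \&create_and_access,
});
```

Passing a negative count lets Benchmark pick the iteration count itself, and the chart from cmpthese shows the relative difference between the cases directly instead of leaving it to be worked out from raw rates.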

One last thing

I don't hate the idea of core class. I love it. I want it to be successful. I'm very encouraged to see a solid improvement in the memory usage metric. I just want it to be better than what we already have and to reflect the needs of what programmers are actually doing.

Top comments (11)

Toby Inkster • Aug 25 '23 • Edited

"Ok so... I was not expecting core class to lose out to Moo."

You're largely testing the speed of the constructor in these tests, because it's a much slower process than reader accessor calls, so any differences in speed of the readers are going to be dwarfed by differences in speed of the constructors. And because there's only one object constructed by each iteration of the test, a lot is also wasted on the overhead of the benchmarking process.

Try tests like:

sub test_case1 {
  for my $i ( 1 .. 5000 ) {
    my $obj = MyClass::Moose->new(
      attribute1 => 'hello',
      attribute2 => 42, 
      attribute3 => $i,
    );
    # Test attribute reads more times than constructor,
    # on the grounds that once you've constructed an
    # object, you'll usually access it many times.
    my $dummy;
    $dummy = $obj->attribute1 for 1 .. 10;
    $dummy = $obj->attribute2 for 1 .. 10;
    $dummy = $obj->attribute3 for 1 .. 10;
  }
}

And obviously the same for the Moo and Core implementations.

"I'm actually surprised that the Moo size and Moose size are the same, I did expect Moose to be much fatter."

Moo and Moose objects are both just blessed hashrefs, and you've used the same hash keys (attribute names) and same values, so of course they're going to be the same sized objects. Where they'll differ is the overall size of the thread/process.

Another thing you might want to take into consideration is the presence or absence of the module Class::XSAccessor. Moo will use it to build XS accessors when possible, and this can have a major impact on benchmarking results. I'm not by any means suggesting that having it installed is "cheating", but not mentioning whether it's installed is leaving out some pretty important information.

John Napiorkowski • Aug 25 '23

I have Class::XSA installed, no doubt if we ever add accessors to core class we can test that again

Paul Evans • Sep 4 '23 • Edited

As others have pointed out: You're largely just benchmarking the constructor here. Core class constructors do a lot of work, and also they're very much not-optimised yet. They're the absolute simplest implementation I could get away with technically working.

One thing you've not benchmarked at all here is how fast it is to access or modify fields from within methods of the actual class. These should be much faster than any of the method-based accessor models can provide, because they won't involve a method call just to read or write a field. Any real-world code is likely to contain large methods that have lots of field accesses. In those cases I would imagine things to run a lot faster.

I'll readily accept that if all you're doing with your objects is using them as "dumb structs" accessed entirely from outside with no interesting internal methods, then there's really not much if anything to be gained by using core classes in this manner. But then I already wrote a better way of handling those kinds of things anyway:

metacpan.org/pod/Struct::Dumb

Edit: I've also added a comment on another thread on the same subject; some more detail to be found here: dev.to/leonerd/comment/294n3

John Napiorkowski • Sep 5 '23

The way fields work in core class is very interesting and I wish it could have been busted out and not come with all the baggage that the rest of core class has. In any case, code would need to be totally rewritten to work as you are suggesting, in which case my boss is going to say "we need to re-write the code? Great, let's move away from Perl".

Paul, the point of my post is I'm trying to see an upside to core class and just not seeing it. You keep asking for feedback, this is what feedback looks like.

Paul Evans • Sep 6 '23

You're trying to treat core's feature class as a fully-finished thing. It's not. It's an early implementation that I haven't spent more than maybe 20 minutes trying to do any optimization for at all. Getting it to work fast has not been at all my focus up till now - I've been trying to get it to exist at all. The fact that it performs in the same ballpark as the rest of the systems is already something close to a miracle.

If anyone more than just me was working on it in more than just bits and pieces of spare time, we'd no doubt have an even snappier, faster, more featured-thing by now. In fact, based I suspect in part on this very discussion, a couple of folks have already begun looking into it to do just that. Perhaps soon we'll have an updated version that performs even better against those many systems that have had a lot of folks looking into them for years already by now.

John Napiorkowski • Sep 7 '23

I'd love for you to have more help on this because I suspect the main reason we're delivering something that is displeasing to many is that it has too few contributors. One of the main findings of my post IMHO is that maybe not everything needs to be written in XS. The fact it's XS is a HUGE barrier to contribution. I had assumed doing this all in XS was needed in order to get big performance and memory wins. Of course some of this needs to be at the XS/C level but maybe not all of it. Can we figure out together how to make it more possible for part of this to move to Perl land such that we can get more contributors? Maybe for example you could do more work on a MOP with an API that is exposed to Perl? I'm happy to work with you on that if we can do it.

M Conrad • Sep 5 '23

"Any real-world code is likely to contain large methods that have lots of field accesses. In those cases I would imagine things to run a lot faster."

For some code, but I hope that you see that this will never be the case for Catalyst or DBIx::Class, and those are probably the use cases that jnap is thinking about. An object system for Catalyst or DBIx::Class is all about accessor speed, and in many cases constructor speed (row objects especially). These systems will also likely never be ported to the class system in its current form because they rely heavily on multiple inheritance.

John Napiorkowski • Sep 5 '23

I think that is the issue here, I love the fields idea but in practice with tons of existing legacy code it doesn't bring much to the table. And what I'm saying is why are we doing all this at the XS level, where only a few people in the world can meaningfully contribute and where the change process has to be gate kept by the perl porters mailing list if there's no material benefit?

chrisarg • Sep 4 '23

Ha, this answered a question I asked at the FB. Now I am even less sure about what to do for an application I am designing: in a typical usage the app will be creating and destroying 10s of M of objects corresponding to database searches. I guess if the objects will never become part of an external API, one can go for blessed references and use something more secure externally? Or just do Moo?

John Napiorkowski • Sep 5 '23

I don't think any Perl application is going to be speedy at creating 10s of millions of objects. Either rework your architecture to not need that or if you do need it you will need to look at something like Golang or Rust.

Dimitrios Kechagias • Sep 4 '23

There was another comparison of more object systems published here.
