Perl 7 Feature Request: sealed subs for typed lexicals

[VERIFIED] Last updated by Joe Schaefer on Tue, 07 May 2024    source
 

The Problem

Perl 5’s OO runtime method lookup has 50% more performance overhead than a direct, named subroutine invocation.

The initial solution: Doug MacEachern’s method lookup optimizations

Doug was the creator of the mod_perl project back in the mid-90s, so obviously writing high performance Perl was his forté. One of his many contributions to p5p was to cut the performance penalty of OO method lookup overhead in half, by using a method + @ISA heirarchy cache to make the runtime object method lookup for mod_perl objects like Apache2::RequestRec as streamlined as possible. But it only gets us half-way there.

This isn’t a trifling issue with calls to C struct get-set accessor methods — the common situation with many mod_perl APIs. Perl’s runtime method-call lookup penalty on httpd’s struct request_rec *, that mod_perl exposes via the Apache2::RequestRec module, is on the same order of magnitude of the full execution of the call. For mod_perl backed sites making millions of XS method calls a second, this is an awful waste of precious CPU cycles.

What Doug was looking for was a way to tell Perl 5 to perform the method lookup at compile time, the way it does with named subroutine calls. Every time Doug tried, he hit roadblocks of either a social or technical nature. Perhaps it’s time to make another pass at this idea with the advent of Perl 7.

Benchmark script

#!/usr/bin/env -S perl -Ilib -Iblib/arch
use Test::More tests => 3;
use POSIX 'dup2';
dup2 fileno(STDERR), fileno(STDOUT);
use strict;
use warnings;
use Benchmark ':all';
our ($x, $z);
$x = bless {}, "Foo";
$z = Foo->can("foo");
sub method {$x->foo}
sub class  {Foo->foo}
sub anon   {$z->($x)}
sub bar    { 1 }
sub reentrant;
BEGIN {
  package Foo;
  use base 'sealed';
  use sealed 'deparse';
  sub foo { shift }
  my $n;
  sub _foo :Sealed { my main $x = shift; $n++ ? $x->bar : $x->reentrant }
}
sub func   {Foo::foo($x)}

BEGIN {our @ISA=qw/Foo/}
use sealed 'deparse';

my main $y; #sealed src filter transforms this into: my main $y = 'main';

sub sealed :Sealed {
    $y->foo();
}

sub also_sealed :Sealed {
    my main $a = shift;
    if ($a) {
        my Benchmark $bench;
        my $inner = $a;
        return sub :Sealed {
            my Foo $b = $a;
            $inner->foo($b->foo($inner->bar, $inner, $bench->cmpthese));
            $a = $inner;
            $a->foo;
            $b->bar; # error!
          };
    }
    $a->bar();
}

sub reentrant :Sealed { my main $b = shift; local our @Q=1; my $c = $b->_foo }

ok($y->reentrant()==1);

my %tests = (
    func => \&func,
    method => \&method,
    sealed => \&sealed,
    class => \&class,
    anon => \&anon,
);

cmpthese 20_000_000, \%tests;

ok(1);

use constant LOOPS => 3;

sub method2 {
  my $obj = "main";
  for (1..LOOPS) {
    $obj->foo;
    $obj->bar;
    $obj->reentrant;
  }
}

sub sealed2 :Sealed {
  my main $obj; # sealed-src-filter
  for (1..LOOPS) {
    $obj->foo;
    $obj->bar;
    $obj->reentrant;
  }
}

cmpthese 1_000_000, {
  method => \&method2,
  sealed => \&sealed2,
};

ok(1);

Benchmark results

1..3
sealed: compiling main->reentrant lookup.
sealed: compiling main->bar lookup.
sub _foo :sealed {
    package Foo;
    use warnings;
    use strict;
    my main $x = shift();
    $n++ ? $x->bar:compiled : $x->reentrant:compiled;
}
sealed: compiling main->foo lookup.
sub sealed :sealed {
    use warnings;
    use strict;
    $y->foo:compiled;
}
sealed: compiling Benchmark->cmpthese lookup.
sealed: compiling Foo->foo lookup.
sealed: compiling main->foo lookup.
sealed: compiling Foo->bar lookup.
sealed: tweak() aborted: sealed: invalid lookup: Foo->bar - did you forget to 'use Foo' first?
sub __ANON__ :sealed {
    use warnings;
    use strict;
    my Foo $b = $a;
    $inner->foo($b->foo:compiled($inner->bar, $inner, $bench->cmpthese:compiled));
    $a = $inner;
    $a->foo:compiled;
    $b->bar;
}
sealed: compiling main->bar lookup.
sub also_sealed :sealed {
    use warnings;
    use strict;
    my main $a = shift();
    if ($a) {
        my Benchmark $bench = 'Benchmark';
        my $inner = $a;
        return sub {
            my Foo $b = $a;
            $inner->foo($b->foo:compiled($inner->bar, $inner, $bench->cmpthese:compiled));
            $a = $inner;
            $a->foo:compiled;
            $b->bar;
        }
        ;
    }
    $a->bar:compiled;
}
sealed: compiling main->_foo lookup.
sub reentrant :sealed {
    use warnings;
    use strict;
    my main $b = shift();
    (local our(@Q)) = 1;
    my $c = $b->_foo:compiled;
}
sealed: compiling main->foo lookup.
sealed: compiling main->bar lookup.
sealed: compiling main->reentrant lookup.
sub sealed2 :sealed {
    use warnings;
    use strict;
    my main $obj = 'main';
    foreach $_ (1 .. 3) {
        $obj->foo:compiled;
        $obj->bar:compiled;
        $obj->reentrant:compiled;
    }
}
ok 1
             Rate  class method   anon   func sealed
class  16129032/s     --    -4%   -26%   -33%   -36%
method 16806723/s     4%     --   -23%   -30%   -34%
anon   21739130/s    35%    29%     --   -10%   -14%
func   24096386/s    49%    43%    11%     --    -5%
sealed 25316456/s    57%    51%    16%     5%     --
ok 2
           Rate method sealed
method 546448/s     --   -17%
sealed 662252/s    21%     --
ok 3

Proposed Perl 7 solution: :sealed subroutines for typed lexicals

Sample code:

use v7.0;
use Apache2::RequestRec;

sub handler :sealed {
  my Apache2::RequestRec $r = shift;
  $r->content_type("text/html"); #compile time method lookup
}

Production-Quality, Robust Perl v5.28+ Prototype: sealed.pm v7.0.7 (on CPAN)

Compile instructions for perl 5.30+ are available in the sealed.pm pod should you want to run mod_perl2 w/ ithreads and httpd-2.4 w/ event mpm, and not segfault at any scale. Tested on Solaris 11.4 and Ubuntu 22.04 on amd64.

For fun, try this monkey patch to ModPerl::RegistryCooker:

<VirtualHost *:443>
    PerlModule ModPerl::RegistryCookerSealed
    PerlResponseHandler ModPerl::Registry
    AddHandler perl-script .pl
    Options +ExecCGI
</VirtualHost>

It enables the effects of sub handler :Sealed {script goes here} on all of your ModPerl::Registry scripts, something like this one.

~/src/cms% h2load -n 100000 -c 1000 -m 100 -t 10 http://localhost/perl-script/enquiry.pl\?lang=.es
starting benchmark...
spawning thread #0: 100 total client(s). 10000 total requests
spawning thread #1: 100 total client(s). 10000 total requests
spawning thread #2: 100 total client(s). 10000 total requests
spawning thread #3: 100 total client(s). 10000 total requests
spawning thread #4: 100 total client(s). 10000 total requests
spawning thread #5: 100 total client(s). 10000 total requests
spawning thread #6: 100 total client(s). 10000 total requests
spawning thread #7: 100 total client(s). 10000 total requests
spawning thread #8: 100 total client(s). 10000 total requests
spawning thread #9: 100 total client(s). 10000 total requests
Application protocol: h2c
progress: 10% done
progress: 20% done
progress: 30% done
progress: 40% done
progress: 50% done
progress: 60% done
progress: 70% done
progress: 80% done
progress: 90% done
progress: 100% done

finished in 13.07s, 7652.14 req/s, 11.83MB/s
requests: 100000 total, 100000 started, 100000 done, 100000 succeeded, 0 failed, 0 errored, 0 timeout
status codes: 100000 2xx, 0 3xx, 0 4xx, 0 5xx
traffic: 154.61MB (162119955) total, 566.39KB (579980) headers (space savings 95.47%), 152.30MB (159700000) data
                       min         max         mean         sd        +/- sd
time for request:     5.74ms      12.77s       6.39s       3.61s    58.14%
time for connect:      304us    293.01ms     70.17ms     76.83ms    74.80%
time to 1st byte:     7.86ms       7.87s       3.33s       1.82s    50.40%
req/s           :       7.71      248.17       19.60       28.07    92.70%

See https://github.com/SunStarSys/sealed/blob/master/lib/sealed.pm. Look for t/bench.pl in the parent directory.

This will allow Perl 5 to do the sample code’s content_type method-lookup at compile time, without causing any back-compat issues or aggrieved CPAN coders, since this feature would target application developers. Not inheritable OO-module authors.

This Perlish idea is gratuitously stolen from Dylan. Read this for the CPython effort from over a decade ago.