Design Questions Towards PPC0024: An OP

Discussion:

Design Questions Towards PPC0024: An OP_MULTIPARAM macro-op

(too old to reply)

Paul "LeoNerd" Evans

2024-11-11 15:14:58 UTC

I've been looking at implementing my new PPC0024 (the one about named
parameters to subs). The overall approach I'm likely to take is to
basically copy the existing implementation from XS::Parse::Sublike.
This would involve creating one big op that handles all of the named
arguments and the trailing slurpy in one go.

Keeping in mind the overall thoughts of making OP_ENTERSUB faster by
not populating @_ for subs using signatures, it seems it'd be more
effort to do named first then no-snails, so I want to attack them in
the other order. Create a better signatures implementation for
no-snails, and then extend it for named parameters afterwards.

Towards this goal I currently have a branch that adds a new
OP_MULTIPARAM, a large macro-style op of the same nature as
OP_MULTIDEREF and OP_MULTICONCAT. The idea is that the optimiser would
recognise certain optree shapes of the current OP_ARG* ops and rewrite
them into a single OP_MULTIPARAM with a list of steps inside it, that
tells it how to work. It will handle all the existing positional and
slurpy arguments (and eventually the named ones too).

My current branch is sitting over here:

https://github.com/leonerd/perl5/tree/faster-signatures

((Note that currently in that branch it's conditionally enabled by a
feature, but that is purely a *temporary* development hack so I can
selectively apply it just when I want to test it. That won't be part
of the final implementation, so ignore that for review purposes.))

I don't feel quite confident enough yet in my design to say "yes do
it", without having answers to a few more questions.

1) In order to detect presence or absence of a passed argument value
from the caller for optional positional parameters, the existing
OP_ARG* ops just compare the size of the defav. This is simple and
works and would translate across to a no-snails approach for
positionals by just comparing the position of the stack pointers.
But when we have named parameters as well, that would no longer
work. This problem is explained more in a comment in
XS::Parse::Sublike along with its solution to it - by using the
SVf_PADSTALE flag.

https://metacpan.org/release/PEVANS/XS-Parse-Sublike-0.30/source/src/parse_subsignature_ex.c#L134

I could do exactly the same thing in the "real" core
implementation of OP_MULTIPARAM for positional/named params. Is
this good enough?

2) The whole thing is done as an optimiser rewrite step, by letting
the parser build up the existing OP_ARG* style of ops, and then
rewriting the optree into using OP_MULTIPARAM instead. Is this the
best way to go about it? Should it be done a different way? E.g.
should the parser directly emit OP_MULTIPARAM itself?

When it gets to named parameters, there aren't OP_ARG* equivalents
that the parser could emit and have rewritten into OP_MULTIPARAM.
The trouble there is that, while the optree for positional
arguments can be build up incrementally, not so for named ones.
There's a single op for all the named params at once, which has to
be piecewise constructed as you parse the declaration syntax for
it. So this rewrite shape would seem to be in the way of that.

The way that XS::Parse::Sublike goes about this is to have its own
large macro-op for handling all the named params, plus the slurpy,
and it builds that op directly.

3) How to handle the special `$self` argument in `method` subs?

So far, all of the above work has been concerned entirely with
subs which begin with just OP_ARG* ops (with scattered
OP_NEXTSTATEs which I can basically ignore). This works fine, but
falls apart for any `method` of `use feature 'class'`, because
those methods start with an extra early OP_METHSTART op. The job
of that methstart is to shift the $self off the arguments list
early, and set up the field bindings so that uses of field
variables will work (even by param defaulting expressions).

Because of this early op, it currently means that none of my
OP_MULTIPARAM optimisations will touch the sub, because it has
something else first that it doesn't understand. I could add a
bunch of special-purpose handling in to handle that one specific
case, and it would work for core methods, but then it wouldn't
work for the same `method` syntax provided by the CPAN module
Object::Pad, which does it by what appears to core perl as just
some custom op. It wouldn't be able to recognise that same thing.

So currently here, I am a little bit stuck for what to do about
that. It would be nice if all the faster-signatures work that
enables that performance boost, and permits named parameters, can
still work nicely with this CPAN module. I just don't currently
have an idea on how to achieve it.

Thoughts welcome on all the above...

--
Paul "LeoNerd" Evans

***@leonerd.org.uk
http://www.leonerd.org.uk/ | https://metacpan.org/author/PEVANS

Paul "LeoNerd" Evans

2024-11-16 01:35:23 UTC

Permalink

On Mon, 11 Nov 2024 15:14:58 +0000

Post by Paul "LeoNerd" Evans
3) How to handle the special `$self` argument in `method` subs?

...

Post by Paul "LeoNerd" Evans
Because of this early op, it currently means that none of my
OP_MULTIPARAM optimisations will touch the sub, because it has
something else first that it doesn't understand. I could add a
bunch of special-purpose handling in to handle that one specific
case, and it would work for core methods, but then it wouldn't
work for the same `method` syntax provided by the CPAN module
Object::Pad, which does it by what appears to core perl as just
some custom op. It wouldn't be able to recognise that same thing.

Further discussion with afresh1 on IRC has lead to a possible idea on
this front.

Already custom ops get a bunch of special metadata fields, like name,
description, and most notably, a custom peephole-optimisation callback.
It would be quite easy to add another one, for the specific case of this
CV-wide OP_MULTIPARAM optimisation, where that function can decide
whether it wants to handle a no-snails rewrite of that op.

Add to `struct custom_op` a field

OP *(*xop_peep_for_nosnails)(pTHX_ OP *o);

which can yield NULL (or be absent, as it would for code unaware of
this) to indicate "I don't know about nosnails, so you'll have to
abort". If it does know about nosnails and wants to handle it, it can
return a new op fragment instead to stand in place of its original,
which then gets swapped in by the optimiser, and we can carry on.

This idea feels like it may have legs.

.. which is ironic for a snail. But there we go ;)

--
Paul "LeoNerd" Evans

***@leonerd.org.uk
http://www.leonerd.org.uk/ | https://metacpan.org/author/PEVANS

Paul "LeoNerd" Evans

2024-11-16 22:11:51 UTC

Permalink

Further research today, adding some extra design notes for those
interested:

4) Both OP_MULTIDEREF and OP_MULTICONCAT follow a scheme of having a
(static) optimiser function in peep.c

S_maybe_multideref()
S_maybe_multiconcat()

whose job is to take an optree fragment that may contain an
optimisation candidate, and if so, replace it in-place with the
optimised MULTI version. In both cases, the original optree
fragments are discarded and entirely replaced with the new ones.
In both cases, modules like B::Deparse therefore have to be aware
of it.

It seems a reasonable enough pattern to continue, so I am building
OP_MULTIPARAM in a similar shape.

5) Thinking ahead to the no-snails optimisation, whereby pp_entersub
does not set up the @_ array but instead simply leaves the caller
arguments right there on the stack for pp_multiparam to consume,
there is a slight problem with all the OP_NEXTSTATE ops sitting
around.

Normally, there's a OP_NEXTSTATE between every statement, and this
includes between every pseudo-"statement" of each individual
signature param, plus one even before that all starts. Normally,
an OP_NEXTSTATE will reset the stack pointer back to the base of
its current context, and free any junk SVs that happened to be on
the stack ahead of that point. Normally this is what you want
between statements, in case some ops mess up and leave extra junk
there.

But in the specific case of the (possibly-multiple) OP_MULTIPARAM
ops that want to incrementally consume all of the caller's
argument values from the stack, this is not what we want. If we
ever let an OP_NEXTSTATE run as it normally runs, it would reset
all those args off the stack and we'd not see them.

I think therefore that OP_MULTIPARAM has to eat those and stop
them happening. It also has to null out the initial OP_NEXTSTATE
before the OP_ARGCHECK. Which does lead to a slight problem in
that any warnings that might occur from OP_MULTIPARAM will report
the wrong location; appearing to come from the caller location and
not the sub's start line. I suspect this can be fixed by storing
the OP_NEXTSTATE as a child of OP_MULTIPARAM, specifically *not*
threading it in via ->op_next, and instead manually setting
PL_curcop equal to it as part of the work done by pp_multiparam.
It's quite subtle though.

Alternative thoughts would be to define a new op flag on
OP_NEXTSTATE to ask it to not reset the stack, so the multiparam
rewrite can set those flags for it. Maybe that is better..?

--
Paul "LeoNerd" Evans

***@leonerd.org.uk
http://www.leonerd.org.uk/ | https://metacpan.org/author/PEVANS