Paul "LeoNerd" Evans
2024-11-11 15:14:58 UTC
I've been looking at implementing my new PPC0024 (the one about named
parameters to subs). The overall approach I'm likely to take is to
basically copy the existing implementation from XS::Parse::Sublike.
This would involve creating one big op that handles all of the named
arguments and the trailing slurpy in one go.
Keeping in mind the overall thoughts of making OP_ENTERSUB faster by
not populating @_ for subs using signatures, it seems it'd be more
effort to do named first then no-snails, so I want to attack them in
the other order. Create a better signatures implementation for
no-snails, and then extend it for named parameters afterwards.
Towards this goal I currently have a branch that adds a new
OP_MULTIPARAM, a large macro-style op of the same nature as
OP_MULTIDEREF and OP_MULTICONCAT. The idea is that the optimiser would
recognise certain optree shapes of the current OP_ARG* ops and rewrite
them into a single OP_MULTIPARAM with a list of steps inside it, that
tells it how to work. It will handle all the existing positional and
slurpy arguments (and eventually the named ones too).
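
To make the "list of steps" idea concrete, here is a toy model in Python rather than core C. It is only a sketch: the step kinds (`Positional`, `SlurpyArray`), the function name `run_multiparam` and the error texts are hypothetical names for illustration, not what the branch actually contains.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


# Hypothetical step kinds; the real op's encoding is not described here.
@dataclass
class Positional:
    name: str
    default: Optional[Callable[[], Any]] = None  # None means mandatory


@dataclass
class SlurpyArray:
    name: str


def run_multiparam(steps, args):
    """Bind caller args to parameter names by walking the step list once."""
    pad = {}
    i = 0  # index of the next unconsumed caller argument
    for step in steps:
        if isinstance(step, Positional):
            if i < len(args):
                pad[step.name] = args[i]
                i += 1
            elif step.default is not None:
                # Argument absent: evaluate the defaulting expression
                pad[step.name] = step.default()
            else:
                raise TypeError(f"too few arguments (missing {step.name})")
        elif isinstance(step, SlurpyArray):
            # A slurpy consumes everything that is left
            pad[step.name] = list(args[i:])
            i = len(args)
    if i < len(args):
        raise TypeError("too many arguments")
    return pad
```

One op walking one list is the point of the model: the per-parameter dispatch that separate ops would each pay happens inside a single loop.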
My current branch is sitting over here:
https://github.com/leonerd/perl5/tree/faster-signatures
((Note that currently in that branch it's conditionally enabled by a
feature, but that is purely a *temporary* development hack so I can
selectively apply it just when I want to test it. That won't be part
of the final implementation, so ignore that for review purposes.))
I don't feel quite confident enough yet in my design to say "yes do
it", without having answers to a few more questions.
1) In order to detect presence or absence of a passed argument value
from the caller for optional positional parameters, the existing
OP_ARG* ops just compare the size of the defav. This is simple and
works and would translate across to a no-snails approach for
positionals by just comparing the position of the stack pointers.
But when we have named parameters as well, that would no longer
work. This problem is explained more in a comment in
XS::Parse::Sublike along with its solution to it - by using the
SVf_PADSTALE flag.
https://metacpan.org/release/PEVANS/XS-Parse-Sublike-0.30/source/src/parse_subsignature_ex.c#L134
I could do exactly the same thing in the "real" core
implementation of OP_MULTIPARAM for positional/named params. Is
this good enough?
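
As a rough illustration of why counting stops working, and of what a per-variable flag buys, here is a toy Python model. It is an assumption-laden sketch of the general idea only: `Slot`, `bind_named` and the error texts are hypothetical, and the `stale` attribute merely stands in for a flag such as SVf_PADSTALE; it does not reproduce the XS::Parse::Sublike code.

```python
class Slot:
    """Toy stand-in for a pad variable carrying a 'stale' flag."""

    def __init__(self):
        self.value = None
        self.stale = True  # set on entry; cleared when a caller value is stored


def bind_named(defaults, pairs):
    """Bind named arguments.

    defaults: dict of name -> zero-arg callable producing the default.
    pairs: flat [key, value, key, value, ...] list as passed by the caller.
    """
    if len(pairs) % 2:
        raise TypeError("odd number of elements in named argument list")
    slots = {name: Slot() for name in defaults}
    for key, value in zip(pairs[0::2], pairs[1::2]):
        if key not in slots:
            raise TypeError(f"unrecognised named argument '{key}'")
        slots[key].value = value
        slots[key].stale = False
    # Any slot still marked stale was not passed, so apply its default
    for name, slot in slots.items():
        if slot.stale:
            slot.value = defaults[name]()
            slot.stale = False
    return {name: slot.value for name, slot in slots.items()}
```

With two optional named parameters, a call passing only `x` and a call passing only `y` both supply two stack items, so the count alone cannot say which default to apply. The flag also keeps an explicitly passed undef (`None` in this model) distinct from an absent argument.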
2) The whole thing is done as an optimiser rewrite step, by letting
the parser build up the existing OP_ARG* style of ops, and then
rewriting the optree into using OP_MULTIPARAM instead. Is this the
best way to go about it? Should it be done a different way? E.g.
should the parser directly emit OP_MULTIPARAM itself?
When it gets to named parameters, there aren't OP_ARG* equivalents
that the parser could emit and have rewritten into OP_MULTIPARAM.
The trouble there is that, while the optree for positional
arguments can be built up incrementally, not so for named ones.
There's a single op for all the named params at once, which has to
be piecewise constructed as you parse the declaration syntax for
it. So this rewrite shape would seem to be in the way of that.
The way that XS::Parse::Sublike goes about this is to have its own
large macro-op for handling all the named params, plus the slurpy,
and it builds that op directly.
3) How to handle the special `$self` argument in `method` subs?
So far, all of the above work has been concerned entirely with
subs which begin with just OP_ARG* ops (with scattered
OP_NEXTSTATEs which I can basically ignore). This works fine, but
falls apart for any `method` of `use feature 'class'`, because
those methods start with an extra early OP_METHSTART op. The job
of that methstart is to shift the $self off the arguments list
early, and set up the field bindings so that uses of field
variables will work (even by param defaulting expressions).
This early op currently means that none of my
OP_MULTIPARAM optimisations will touch the sub, because it has
something else first that it doesn't understand. I could add a
bunch of special-purpose handling in to handle that one specific
case, and it would work for core methods, but then it wouldn't
work for the same `method` syntax provided by the CPAN module
Object::Pad, which does it by what appears to core perl as just
some custom op. It wouldn't be able to recognise that same thing.
So currently here, I am a little bit stuck for what to do about
that. It would be nice if all the faster-signatures work that
enables that performance boost, and permits named parameters, can
still work nicely with this CPAN module. I just don't currently
have an idea on how to achieve it.
Thoughts welcome on all the above...
--
Paul "LeoNerd" Evans
***@leonerd.org.uk
http://www.leonerd.org.uk/ | https://metacpan.org/author/PEVANS