Hi internals,
I would like to present a possible new RFC ("keep type of reference params") for your
consideration.
First, an example:
<?php
function my_array_shift(array &$array) {
    $array = "string";
}
$array = [0, 1, 2, 3, 4];
my_array_shift($array);
count($array);
The result of this code is a warning (on the count() line) because $array is now a string.
However, I think assigning a string to $array should raise an error or exception.
In my opinion, $array should keep its type when the function ends.
What is your opinion? Do you find this useful?
Thanks, and sorry for my English (I'm Spanish).
Regards
--
Manuel Canga
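As an illustration of the behaviour requested above, the check can be approximated in userland today. This is only a sketch: call_keeping_type() is a hypothetical helper, not an existing PHP function, and it compares gettype() results rather than declared parameter types.

```php
<?php
// The function from the example above: it replaces the caller's array with a string.
function my_array_shift(array &$array) {
    $array = "string";
}

// Hypothetical helper (not part of PHP): calls $callee with $value by reference
// and restores the old value if the callee changed the value's type.
function call_keeping_type(callable $callee, &$value): void {
    $original   = $value;
    $typeBefore = gettype($value);
    $callee($value);
    $typeAfter = gettype($value);
    if ($typeAfter !== $typeBefore) {
        $value = $original; // roll back before reporting the problem
        throw new TypeError("Argument changed type from $typeBefore to $typeAfter");
    }
}

$array = [0, 1, 2, 3, 4];
try {
    call_keeping_type('my_array_shift', $array);
} catch (TypeError $e) {
    echo $e->getMessage(), "\n"; // Argument changed type from array to string
}
echo count($array), "\n"; // 5
```

The rollback keeps a second handle on the original value, which is the kind of extra cost that comes up later in the thread.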
This would break quite a lot of existing code, though PHP could add an
explicit keyword like "inout" that catches this behaviour (see example in
Hack: https://docs.hhvm.com/hack/functions/inout-parameters).
Today these issues can also be caught with static analysis:
https://psalm.dev/r/1f670956ab
I think I'd be supportive of adding inout parameters in addition to
existing support for references.
Cheers,
Ben
---- On Mon, 04 May 2020 15:01:44 +0200, Matthew Brown matthewmatthew@gmail.com wrote ----
Thanks, Matthew, I didn't know about inout.
I think I'd be in favour of adding an inout
keyword in order to:
- Check that the type on the way out equals the type of the parameter (as in the example in my first email).
- Avoid modifying the caller's variable when the function throws an exception.
Another option is to add these behaviours to normal references (&) when strict_types is enabled in the caller.
Regards
Manuel Canga
Another huge advantage of adding inout parameters is that we could make
it mandatory to mark at the call-site, and get the advantages of this
RFC with less of the compatibility hassle of changing the way references
work: https://wiki.php.net/rfc/explicit_send_by_ref
We'd still need to figure out how to smoothly transition built-in
functions to use inout rather than by-ref parameters; but then we
already have magic "prefer-ref" in internal functions, so maybe we could
come up with some kind of "prefer-inout".
Regards,
--
Rowan Tommins (né Collins)
[IMSoP]
---- On Mon, 04 May 2020 19:33:51 +0200, Rowan Tommins rowan.collins@gmail.com wrote ----
Hi, Rowan. That's a good point.
If nobody objects to this RFC, could someone give me (wiki user manuelcanga) karma so that I can create the wiki page?
Thanks in advance
--
Manuel Canga
> What is your opinion ? Do you see it useful ?
I can see the need; I strongly dislike the idea of using references for this.
What you're describing is a form of 'out parameters' which have been
mentioned a few times before on this list.
I've made a note of the general idea here:
https://github.com/Danack/RfcCodex/blob/master/out_parameters.md
Though the Wikipedia article is also good:
https://en.wikipedia.org/wiki/Parameter_(computer_programming)#Output_parameters
Using references for things like this is a bad idea as references make
it hard to reason about code. In this specific case, having the
function mutate the existing variable is horrible.
> If nobody objects to this RFC
While you categorically must do what you must, I don't think you will
get much benefit from raising this as an RFC until you find someone
who can and is willing to implement it.
Actually, just finding any core contributor who is willing to say that
they would back this idea might be a good idea.
And yes, phpinternals could still really do with an 'informal but only
open to voters' feedback mechanism. Someone should do something
about that... wait, I'm someone. And I have wiki permissions.
So here's a page for a non-binding 'indication of interest' poll:
https://wiki.php.net/indication_of_interest/keep_type_of_reference_params
If anyone disagrees with the text, either please suggest a change or
edit it if you have karma.
cheers
Dan
Ack
I think changing reference is a bad idea, even if it's behind some
sort of declare or ini setting or what-have-you.
I do support inout, which has similar high-level outcomes: the
current value is fed in, and when the function returns it has a new
value. I think we should use & at the call site, and we should also
permit it optionally for by-ref parameters as it provides a migration
path: first add & to all the call sites, then switch it to inout.
---- On Fri, 08 May 2020 15:31:52 +0200, Levi Morrison levi.morrison@datadoghq.com wrote ----
Hi, Levi,
Thanks for your opinion.
Why not do it the other way round?
'inout' would be created with these three characteristics:
- Check that the type on the way out equals the type of the parameter.
- Avoid modifying the caller's variable when the function throws an exception (like inout in Hack).
- Allow explicit call-site pass-by-reference annotation.
Then current references would be deprecated and, in a later version, removed.
In this way, projects could be adapted gradually.
Regards
Manuel Canga
---- On Fri, 08 May 2020 15:07:13 +0200, Dan Ackroyd Danack@basereality.com wrote ----
> > What is your opinion ? Do you see it useful ?
> I can see the need; I strongly dislike the idea of using references for this.
> What you're describing is a form of 'out parameters' which have been
> mentioned a few times before on this list.
> I've made a note of the general idea here:
> https://github.com/Danack/RfcCodex/blob/master/out_parameters.md
> Though the wikipedia article is also good:
> https://en.wikipedia.org/wiki/Parameter_(computer_programming)#Output_parameters
> Using references for things like this is a bad idea as references make
> it hard to reason about code. In this specific case, having the
> function mutate the existing variable is horrible.
Hi, Dan,
I don't think anything is a bad practice in itself; it depends on the context.
For example, global variables, __set/__get, public properties and so on could be considered bad practices;
however, they are sometimes useful.
References may be a bad idea a priori, but if you limit how dangerous they are, they become less bad.
> > If nobody objects to this RFC
> While you categorically must do what you must, I don't think you will
> get much benefit from raising this as an RFC until you find someone
> who can and is willing to implement it.
I agree :'(
> Actually, just finding any core contributor who is willing to say that
> they would back this idea might be a good idea.
> And yes, phpinternals could still really do with an 'informal but only
> to open to voters' feedback mechanism. Someone should do something
> about that....wait I'm someone. And I have wiki permissions.
> So here's a page for a non-binding 'indication of interest' poll:
> https://wiki.php.net/indication_of_interest/keep_type_of_reference_params
Thanks so much, Dan
> If anyone disagrees with the text, either please suggest a change or
> edit it if you have karma.
I'd add (but I don't have karma):
Add a new inout keyword (very similar to Hack's inout) in order to:
- Check that the type on the way out equals the type of the parameter (as in the example in my first email).
- Avoid modifying the caller's variable when the function throws an exception. (Thanks to Matthew for the link* to Hack's inout.)
- Allow explicit call-site pass-by-reference annotation. (Thanks to Rowan for the link** to Nikita's RFC.)
* https://docs.hhvm.com/hack/functions/inout-parameters
** https://wiki.php.net/rfc/explicit_send_by_ref
> cheers
> Dan
> Ack
Thanks again, Dan,
Regards
Manuel Canga
I'm finding it hard to follow what is actually being proposed here at this
point (as many different ideas seem to be discussed at the same time).
I've granted you RFC karma on the wiki in case you want to write down
something.
As others have mentioned, this is not a simple topic from the
implementation side, so it's good to have a firm idea of how things would
work on a technical level. If you want to pursue the "inout" idea, I would
recommend reading through https://externals.io/message/101254 in its
entirety, because there is quite a bit of inout related discussion in
there. My current assessment is that I do not see any way to implement
inout in a way that both does not use references and has acceptable
performance. (Implementing inout on top of references is possible, but has
impact on its behavior, e.g. the fact that the reference will be
initialized to null by default, even if the function throws.)
One of the core problems is that any naive approach to inout (i.e.
literally implementing it as a read before the call and a write after the
call) will necessitate a copy of the modified value, precluding the rc=1
copy-on-write avoidance (no copy is needed when the refcount is 1).
Implementing array_push() as an inout operation would copy the
array every time a value is pushed. Maybe this is actually a fundamental
implementation-independent property of inout, if it has the semantics that
the original value is not changed on exception (a copy would be necessary
in case code later in the function throws.)
Regards,
Nikita
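The "read before the call, write after the call" approach described above can be sketched in today's PHP with a by-value parameter and a return value. This is an illustration under assumptions, not how an engine-level inout would be implemented; push_value() is a hypothetical function and the inout call syntax in the comment does not exist in PHP.

```php
<?php
// Hypothetical callee written in "inout style": value in, new value out.
function push_value(array $list, $item): array {
    // The caller's variable still holds the same array at this point, so the
    // array is shared and this write has to separate (copy) it first.
    $list[] = $item;
    return $list;
}

$numbers = [1, 2, 3];

// Desugared form of a hypothetical push_value(inout $numbers, 4):
// read the value before the call, write the result back after it.
$numbers = push_value($numbers, 4);

var_dump($numbers === [1, 2, 3, 4]); // bool(true)
```

With a by-reference parameter the callee would instead append to the caller's array in place, which is the copy the naive scheme cannot avoid.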
Hi, Nikita and internals,
---- En lun, 11 may 2020 11:34:22 +0200 Nikita Popov nikita.ppv@gmail.com escribió ----
Thanks, Nikita. I will read that thread.
I don't know the PHP core, so I may be saying something silly, but... would it be possible to use Hack's implementation? Maybe they found the best way to implement it.
Regards
--
Manuel Canga
Hi, Internals,
---- En lun, 11 may 2020 11:34:22 +0200 Nikita Popov nikita.ppv@gmail.com escribió ----
Nikita, I've already read your thread. I like your proposal for "Explicit call-site send-by-ref syntax". I don't understand why it wasn't accepted.
Yes, you're right about performance, so maybe this RFC doesn't make much sense. :(
Thanks anyway. You can remove my karma again.
Regards
Manuel Canga
I think everyone can agree this is useful. But the issue here is the
implementation, because from what I know PHP's references are a special
kind of pain in the engine. That's why the common wisdom is to use
references in PHP as little as possible.
And IIRC what you are trying to achieve would need a major overhaul of how
references work, and someone who wanted to tackle this would have done it on
their own and proposed an RFC at the same time.
So sadly, unless something semi-concrete shows up, I'm considering this a
pipe dream.
But I'd gladly be shown otherwise.
Best regards
George P. Banyard
On 08.05.2020 at 12:37, G. P. B. wrote:
Isn't that already solved for typed properties?
Consider this:
class A { public static int $number = 5; }
$num = &A::$number;
$num = "String";
This will result in an uncaught TypeError,
see https://3v4l.org/XC6hk
I would think, it would be consistent if referenced parameters
behaved in exactly the same way.
Regards,
Thomas
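The contrast Thomas describes can be seen side by side in a short script. This is a sketch assuming PHP 7.4 or later (typed properties); overwrite() is a hypothetical function added only for the comparison.

```php
<?php
class A {
    public static int $number = 5;
}

// A reference to a typed property keeps enforcing the property's type.
$num = &A::$number;
try {
    $num = "String"; // not a numeric string, so it cannot be coerced to int
} catch (TypeError $e) {
    echo get_class($e), "\n"; // TypeError
}
var_dump(A::$number); // int(5)

// A typed by-reference parameter is only checked when the function is entered.
function overwrite(int &$n): void {
    $n = "String"; // no error here
}

$local = 5;
overwrite($local);
var_dump($local); // string(6) "String"
```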
On Fri, 8 May 2020 at 13:04, Thomas Gutbier thomas.gutbier@anthrotec.de
wrote:
Indeed, I forgot that references to typed class properties were a thing
now.
It does make this way less of a pipe dream and something achievable, but I
would imagine this not to be trivially implemented/extended to accommodate
this.
Moreover, although this is bad design IMHO, some people may rely on this
weird feature, which brings back the whole question of how to handle BC.
I, for one, would be in favour of burning this into the ground, but I don't
know how others feel.
Best regards
George P. Banyard
---- On Fri, 08 May 2020 13:03:59 +0200, Thomas Gutbier thomas.gutbier@anthrotec.de wrote ----
Hi, Thomas,
That option was my first thought. However, the people I know (myself included) normally pass local variables to function calls.
Thanks for your opinion, Thomas.
Regards
Manuel Canga
---- On Fri, 08 May 2020 12:37:29 +0200, G. P. B. george.banyard@gmail.com wrote ----
Hi, George. Thanks for your opinion.
Nikita proposed an RFC to improve references (https://wiki.php.net/rfc/explicit_send_by_ref, link thanks to Rowan).
Maybe he would want to code this.
Regards
--
Manuel Canga
On 04.05.2020 at 10:53, Manuel Canga php@manuelcanga.dev wrote:
Hey Manuel,
the primary issue (apart from the BC break) here is leaking the reference across the function boundary.
function a(array &$a) {
    $GLOBALS["globalA"] = &$a;
}
function b() {
    $GLOBALS["globalA"] = 10;
}
$a = [1];
a($a);
b();
// $a is magically changed to 10
Yes, you can verify here that $a is an array at the function boundaries, but you cannot do so afterwards.
If we had proper inout parameters (which do not leak a reference, but assign the value of the variable in callee scope back to the variable passed by the caller), then we could easily enforce it.
But as it stands now, this is not an option. (Especially due to the false promise this seems to make.)
Bob
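The difference between a leaked reference and copy-in/copy-out semantics can be sketched as follows. The class Registry and the functions leak_by_ref(), leak_by_value() and clobber() are hypothetical names used only for this illustration; the by-value version merely emulates what an inout parameter might do.

```php
<?php
// Hypothetical stand-in for any long-lived place a reference can escape to.
final class Registry {
    public static $slot;
}

function clobber(): void {
    Registry::$slot = 10;
}

// By-reference parameter: the stored reference aliases the caller's variable.
function leak_by_ref(array &$items): void {
    Registry::$slot = &$items;
}

// Emulated inout: the stored reference aliases only the callee's local copy.
function leak_by_value(array $items): array {
    Registry::$slot = &$items;
    return $items;
}

$a = [1, 2, 3];
leak_by_ref($a);
clobber();
var_dump(is_array($a)); // bool(false): $a is now int(10)

$b = [1, 2, 3];
$b = leak_by_value($b);
clobber();
var_dump(is_array($b)); // bool(true): $b is untouched
```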
---- On Fri, 08 May 2020 23:40:22 +0200, Bob Weinand bobwei9@hotmail.com wrote ----
Thanks Bob, that was a great example.
Let me explain... I am very forgetful, so it's very easy for me to write something like this:
function filter_something(string &$to_filter) {
    $to_filter = strtolower($to_filter);
    $to_filter = sanitize_this($to_filter);
    // .....
}
filter_something($my_string);
Sometimes one of these functions can return false|null|numeric|... instead of the expected type.
Also, if any function in filter_something threw an exception, the value in $to_filter would be left in an inconsistent state.
However, with:
function filter_something(inout string $to_filter) {
    $to_filter = strtolower($to_filter);
    $to_filter = sanitize_this($to_filter);
    // .....
}
filter_something(inout $my_string);
PHP could, in the end, throw an exception when any function in filter_something
changes the type of $my_string, or
PHP could also keep the original value of $to_filter if any function threw an exception.
Regards
Manuel Canga
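The exception case from this last message can be reproduced with current PHP. In this sketch sanitize_this() is a hypothetical helper that always fails, and filtered() is a hypothetical by-value variant standing in for the proposed inout behaviour.

```php
<?php
// Hypothetical helper that always fails, to simulate an error mid-function.
function sanitize_this(string $value): string {
    throw new RuntimeException('sanitizer failed');
}

// By-reference version: the caller's variable is modified step by step.
function filter_something(string &$to_filter): void {
    $to_filter = strtolower($to_filter);
    $to_filter = sanitize_this($to_filter);
}

// By-value version: the caller's variable changes only if the call returns.
function filtered(string $to_filter): string {
    $to_filter = strtolower($to_filter);
    return sanitize_this($to_filter);
}

$my_string = 'Hello World';
try {
    filter_something($my_string);
} catch (RuntimeException $e) {
    // ignored for the demonstration
}
var_dump($my_string); // string(11) "hello world" (half-processed)

$other = 'Hello World';
try {
    $other = filtered($other);
} catch (RuntimeException $e) {
    // the assignment never happened
}
var_dump($other); // string(11) "Hello World" (unchanged)
```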