[RFC] Transform exit() from a language construct into a standard function

1 year ago by tim@bastelstu.be — view source

unread

Hi

Let me know what you think.

The fewer not-actually-a-function functions, the better. I don't have
any remarks and I'm in favor.

Best regards
Tim Düsterhus

1 year ago by Niels Dossche — view source

unread

Hello Internals,

I would like to formally propose my idea for exit() as a function brought up to the list on 2024-02-24 [1] with the following RFC:
https://wiki.php.net/rfc/exit-as-function

There have been some slight tweaks to the implementation, namely that the transformation from a "constant" to a function is done at compile time and we do not hook into the behaviour of constants any longer.

Let me know what you think.

Best regards,

Gina P. Banyard

[1] https://externals.io/message/122483

Hi Gina

Thanks for proposing this, I'm in favor of this change because this creates more consistency.

Kind regards
Niels

1 year ago by flaviohbatista@gmail.com — view source

unread

On Thu, May 9, 2024 at 7:51 AM Niels Dossche dossche.niels@gmail.com
wrote:

Hello Internals,

I would like to formally propose my idea for exit() as a function
brought up to the list on 2024-02-24 [1] with the following RFC:
https://wiki.php.net/rfc/exit-as-function

There have been some slight tweaks to the implementation, namely that
the transformation from a "constant" to a function is done at compile time
and we do not hook into the behaviour of constants any longer.

Let me know what you think.

Best regards,

Gina P. Banyard

[1] https://externals.io/message/122483

Hi Gina

Thanks for proposing this, I'm in favor of this change because this
creates more consistency.

Kind regards
Niels

Hi all,

Is there a list of other "functions" that are handled as "constants"?

--
Atenciosamente,

Flávio Heleno

1 year ago by Derick Rethans — view source

unread

On Thu, May 9, 2024 at 7:51 AM Niels Dossche dossche.niels@gmail.com
wrote:

Hello Internals,

I would like to formally propose my idea for exit() as a function
brought up to the list on 2024-02-24 [1] with the following RFC:
https://wiki.php.net/rfc/exit-as-function

There have been some slight tweaks to the implementation, namely
that the transformation from a "constant" to a function is done at
compile time and we do not hook into the behaviour of constants
any longer.

Let me know what you think.

Thanks for proposing this, I'm in favor of this change because this
creates more consistency.

Is there a list of other "functions" that are handled as "constants"?

I don't think there are any other "functions" like this.

That's why removing the special case for "exit" and "die" is a good
idea.

cheers,
Derick

1 year ago by Jorg Sowa — view source

unread

I don't think there are any other "functions" like this.

What about list(), isset(), print(), echo(), require(), include(), unset(),
empty()? We use them the same way as functions, but those are not real
functions.

Kind regards,
Jorg

1 year ago by tim@bastelstu.be — view source

unread

Hi

I don't think there are any other "functions" like this.

What about list(), isset(), print(), echo(), require(), include(), unset(),
empty()? We use them the same way as functions, but those are not real
functions.

All of them require to be followed by an expression (or some other
non-empty token), thus they do not act like constants the same way
'exit' and 'die' do:

unset, isset, empty, list must be followed by '('.
echo is a statement and needs to be terminated with ';'.
print is an expression and requires a single non-empty expression.
The include family is an expression and requires a single non-empty
expression.

Best regards
Tim Düsterhus

PS: Today I learned that echo and print are not aliases.

1 year ago by Gina P. Banyard — view source

unread

I don't think there are any other "functions" like this.
What about list(), isset(), print(), echo(), require(), include(), unset(), empty()? We use them the same way as functions, but those are not real functions.
Kind regards,
Jorg

list() (and array()) may look like functions but do not behave like one as they affect the current scope by setting various variables.

isset()/empty()/unset() require to work with undefined variables and have deeply ingrained behaviour within the engine, so making them simple functions is not as much of a "trivial" change.

print, echo, include(_once) and require(_once) do not mandate their "argument" to be passed within parenthethis, so making them functions does not simplify the lexer/parser nor removes them as keywords.
Also I don't know the last time I've used those language constructs "like a function".

Hope this clarifies things.

Best regards,
Gina P. Banyard

1 year ago by Rowan Tommins [IMSoP] — view source

unread

print, echo, include(_once) and require(_once) do not mandate their "argument" to be passed within parenthethis, so making them functions does not simplify the lexer/parser nor removes them as keywords.

It's actually a much stronger difference than that: parentheses are not parsed as surrounding the argument lists for those keywords at all.

A while ago, I added notes to the manual pages of each showing how this can lead to misleading code, e.g. one of the examples on https://www.php.net/print is this:

print(1 + 2) * 3;
// outputs "9"; the parentheses cause 1+2 to be evaluated first, then 3*3
// the print statement sees the whole expression as one argument

echo has further peculiarities, because it takes an unparenthesised list of arguments, and can't be used in an expression.

While it would probably have been better if those had been parsed like functions to begin with, changing them now would not just be pointless, it would be actively dangerous, changing the behaviour of existing code.

Regards,
Rowan Tommins
[IMSoP]

1 year ago by Juliette Reinders Folmer — view source

unread

I don't think there are any other "functions" like this.

What about list(), isset(), print(), echo(), require(), include(),
unset(), empty()? We use them the same way as functions, but those
are not real functions.

list() (and array()) may look like functions but do not behave like
one as they affect the current scope by setting various variables.

isset()/empty()/unset() require to work with undefined variables and
have deeply ingrained behaviour within the engine, so making them
simple functions is not as much of a "trivial" change.

print, echo, include(_once) and require(_once) do not mandate their
"argument" to be passed within parenthethis, so making them functions
does not simplify the lexer/parser nor removes them as keywords.
Also I don't know the last time I've used those language constructs
"like a function".

Seeing this list, makes me wonder: what about eval() ?

1 year ago by Gina P. Banyard — view source

unread

I don't think there are any other "functions" like this.
What about list(), isset(), print(), echo(), require(), include(), unset(), empty()? We use them the same way as functions, but those are not real functions.

list() (and array()) may look like functions but do not behave like one as they affect the current scope by setting various variables.

isset()/empty()/unset() require to work with undefined variables and have deeply ingrained behaviour within the engine, so making them simple functions is not as much of a "trivial" change.

print, echo, include(_once) and require(_once) do not mandate their "argument" to be passed within parenthethis, so making them functions does not simplify the lexer/parser nor removes them as keywords.
Also I don't know the last time I've used those language constructs "like a function".

Seeing this list, makes me wonder: what about eval() ?

eval() might make sense to try and convert to a function, this would probably also cleany solve the complaint about not being able to disable eval() [1][2]

Best regards,
Gina P. Banyard

[1] https://bugs.php.net/bug.php?id=62397

[2] https://github.com/php/php-src/pull/4084

1 year ago by Larry Garfield — view source

unread

Hello Internals,

I would like to formally propose my idea for exit() as a function
brought up to the list on 2024-02-24 [1] with the following RFC:
https://wiki.php.net/rfc/exit-as-function

There have been some slight tweaks to the implementation, namely that
the transformation from a "constant" to a function is done at compile
time and we do not hook into the behaviour of constants any longer.

Let me know what you think.

Best regards,

Gina P. Banyard

[1] https://externals.io/message/122483

I support this. My only question (which applies to current exit() as well), is what the exit code is when you pass a string as the parameter. That's not clear from the exit() docs currently, but the RFC made me wonder. (This may be just a doc fix and we move on with life.)

--Larry Garfield

1 year ago by tim@bastelstu.be — view source

unread

Hi

I support this. My only question (which applies to current exit() as well), is what the exit code is when you pass a string as the parameter. That's not clear from the exit() docs currently, but the RFC made me wonder. (This may be just a doc fix and we move on with life.)

If a string is passed, the exit_status is left untouched [1]. In
practice this should always result in an exit status of 0, because I
can't see how the exit_status would be non-zero before exit("foo");
successfully executes.

Best regards
Tim Düsterhus

[1]
https://github.com/php/php-src/blob/f88fc9c6e8262fe48eb9cf078e32e63af19047f2/Zend/zend_vm_def.h#L7541-L7570

1 year ago by Saki Takamachi — view source

unread

Hi Gina,

Hello Internals,

I would like to formally propose my idea for exit() as a function brought up to the list on 2024-02-24 [1] with the following RFC:
https://wiki.php.net/rfc/exit-as-function

There have been some slight tweaks to the implementation, namely that the transformation from a "constant" to a function is done at compile time and we do not hook into the behaviour of constants any longer.

Let me know what you think.

Best regards,

Gina P. Banyard

[1] https://externals.io/message/122483

I generally agree :)

My only concern is that if this function throws an error and I catch it, won't it proceed against the user's intentions about it. I am concerned that this may lead to data corruption and personal information leaks due to unintentional data being inserted into the DB.

try {
    exit([]);
} catch (Throwable) {
    //
}

Probably no one likes to write code like this, and even if there are, there are probably very few, but code that is written with the assumption that exit will always stop processing may be a little dangerous.

Is it nonsense to always stop processing even if an error occurs? Or, how about creating an uncatchable exception class specifically for exits? Or is this level of risk negligible?

I am writing down the ideas I came up with, so please excuse me if there are any mistakes in my understanding.

Regard,

Saki

1 year ago by tim@bastelstu.be — view source

unread

Hi

Is it nonsense to always stop processing even if an error occurs? Or, how about creating an uncatchable exception class specifically for exits? Or is this level of risk negligible?

This is already the case. exit throws an internal uncatchable
Exception. Quoting from the RFC:

Finally, the need for exit() to be a language construct with its own dedicated opcode is not a requirement any more since PHP 8.0 as the opcode throws a special kind of exception which cannot be caught, 2) nor executes finally blocks, to unwind the stack normally.

Best regards
Tim Düsterhus

1 year ago by Saki Takamachi — view source

unread

Hi Tim,

This is already the case. exit throws an internal uncatchable Exception. Quoting from the RFC:

Thanks, I had overlooked that. I have no other concerns.

Regards,

Saki

1 year ago by Gina P. Banyard — view source

unread

Hi Gina,

Hello Internals,

I would like to formally propose my idea for exit() as a function brought up to the list on 2024-02-24 [1] with the following RFC:
https://wiki.php.net/rfc/exit-as-function

There have been some slight tweaks to the implementation, namely that the transformation from a "constant" to a function is done at compile time and we do not hook into the behaviour of constants any longer.

Let me know what you think.

Best regards,

Gina P. Banyard

[1] https://externals.io/message/122483

I generally agree :)

My only concern is that if this function throws an error and I catch it, won't it proceed against the user's intentions about it. I am concerned that this may lead to data corruption and personal information leaks due to unintentional data being inserted into the DB.

try { exit([]); } catch (Throwable) { // }

Probably no one likes to write code like this, and even if there are, there are probably very few, but code that is written with the assumption that exit will always stop processing may be a little dangerous.

Is it nonsense to always stop processing even if an error occurs? Or, how about creating an uncatchable exception class specifically for exits? Or is this level of risk negligible?

I am writing down the ideas I came up with, so please excuse me if there are any mistakes in my understanding.

Regard,

Saki

You can already escape an exit() call, either by setting an error handler that throws, or by passing a non-stringable object.
See: https://3v4l.org/vR3up

Best regards,

Gina P. Banyard

1 year ago by Juliette Reinders Folmer — view source

unread

I would like to formally propose my idea for exit() as a function brought up to the list on 2024-02-24 [1] with the following RFC:
https://wiki.php.net/rfc/exit-as-function

There have been some slight tweaks to the implementation, namely that the transformation from a "constant" to a function is done at compile time and we do not hook into the behaviour of constants any longer.

No objections from my side, though, yes, PHPCS will need to be
updated/adjusted to work around this, but that's no biggie.

When reading the RFC, there are two things about which I still have
questions.

As things are, exit and die cannot be used as a label for a goto
statement.

goto exit;

exit:
   echo 'exited';

https://3v4l.org/fluuk and https://3v4l.org/cNMEW

Will that change now exit would no longer be a reserved keyword ?

The RFC mentions the "old" semantics regarding type juggling for
exit/die - always cast to string - and it mentions that passing
resources or arrays in the new situation will become a TypeError, but
that still leaves some room for interpretation for the other types, in
particular the handling of booleans.

How I read the RFC, the type juggling would change as follows (but I may
well be wrong!):

| Param passed | Old | New | Consequences |
|-----------------------|---------|-----------|---------------------------------------------------------------------------------------------------------------|
| integer | integer | integer | No change, interpreted
as exit code |
| string | string | string | No change, interpreted
as status message |
| bool | string | integer | Was status message, now
exit code |
| float | string | integer | Was status message, now
exit code, "Implicit conversion from float to int loses precision"
deprecation notice |
| null | string | integer | Was status message, now
exit code, "Passing null to parameter #1 ($status) of type string|int
is deprecated" |
| stringable object | string | string | No change, interpreted
as status message |
| non-stringable object | string | TypeError | |
| array | string | TypeError | |
| resource | string | TypeError | |

Might it be an idea to make all the type juggling changes explicit in
the RFC ? (and correct whatever I interpreted incorrectly)

Smile,
Juliette

1 year ago by Gina P. Banyard — view source

unread

I would like to formally propose my idea for exit() as a function brought up to the list on 2024-02-24 [1] with the following RFC:
https://wiki.php.net/rfc/exit-as-function
There have been some slight tweaks to the implementation, namely that the transformation from a "constant" to a function is done at compile time and we do not hook into the behaviour of constants any longer.

No objections from my side, though, yes, PHPCS will need to be updated/adjusted to work around this, but that's no biggie.

When reading the RFC, there are two things about which I still have questions.

As things are, exit and die cannot be used as a label for a goto statement.
goto exit;

exit:
echo 'exited';
https://3v4l.org/fluuk and https://3v4l.org/cNMEW

Will that change now exit would no longer be a reserved keyword ?

I had not thought about goto labels, but indeed it would be possible to use exit and die as goto labels with this RFC.
I have clarrified this change in the RFC text and added tests for this in the PR.

The RFC mentions the "old" semantics regarding type juggling for exit/die - always cast to string - and it mentions that passing resources or arrays in the new situation will become a TypeError, but that still leaves some room for interpretation for the other types, in particular the handling of booleans.

How I read the RFC, the type juggling would change as follows (but I may well be wrong!):

| Param passed | Old | New | Consequences |
|-----------------------|---------|-----------|---------------------------------------------------------------------------------------------------------------|
| integer | integer | integer | No change, interpreted as exit code |
| string | string | string | No change, interpreted as status message |
| bool | string | integer | Was status message, now exit code |
| float | string | integer | Was status message, now exit code, "Implicit conversion from float to int loses precision" deprecation notice |
| null | string | integer | Was status message, now exit code, "Passing null to parameter #1 ($status) of type string|int is deprecated" |
| stringable object | string | string | No change, interpreted as status message |
| non-stringable object | string | TypeError | |
| array | string | TypeError | |
| resource | string | TypeError | |

Might it be an idea to make all the type juggling changes explicit in the RFC ? (and correct whatever I interpreted incorrectly)

I have done so, and fixed the one case you got wrong.

Best regards,
Gina P. Banyard

1 year ago by Gina P. Banyard — view source

unread

Hello Internals,

I would like to formally propose my idea for exit() as a function brought up to the list on 2024-02-24 [1] with the following RFC:
https://wiki.php.net/rfc/exit-as-function

There have been some slight tweaks to the implementation, namely that the transformation from a "constant" to a function is done at compile time and we do not hook into the behaviour of constants any longer.

Let me know what you think.

Best regards,

Gina P. Banyard

[1] https://externals.io/message/122483

As there haven't been any comments for nearly two weeks, I'm planning on opening the vote for the RFC on Tuesday.

Best regards,

Gina P. Banyard

1 year ago by Ilija Tovilo — view source

unread

Hi Gina

I would like to formally propose my idea for exit() as a function brought up to the list on 2024-02-24 [1] with the following RFC:
https://wiki.php.net/rfc/exit-as-function

As there haven't been any comments for nearly two weeks, I'm planning on opening the vote for the RFC on Tuesday.

As mentioned early on in private, I don't see a convincing reason to
remove tokenizer/parser support for exit/die. I'd rather see this
handled in the parser directly, by converting the standalone keywords
to function calls. This avoids any backwards incompatibility, and
avoids special handling in zend_compile_const.

Another thing that's probably not too important: The PR likely breaks
dead code elimination for exit() and die(). This could be re-added by
checking for the never return type instead. You'd also need to
special-case the lookup of exit/die in namespaced code, where it will
always refer to the global function (as they cannot be declared in
userland).

Ilija

1 year ago by Gina P. Banyard — view source

unread

Hi Gina

I would like to formally propose my idea for exit() as a function brought up to the list on 2024-02-24 [1] with the following RFC:
https://wiki.php.net/rfc/exit-as-function

As there haven't been any comments for nearly two weeks, I'm planning on opening the vote for the RFC on Tuesday.

As mentioned early on in private, I don't see a convincing reason to
remove tokenizer/parser support for exit/die. I'd rather see this
handled in the parser directly, by converting the standalone keywords
to function calls. This avoids any backwards incompatibility, and
avoids special handling in zend_compile_const.

I must be honest, I don't really understand how parsers work, so I went with what I could understand and implement.
But I'm struggling to see a reason for keeping the token in the first place, and what issues hooking into zend_compile_const() has.

Another thing that's probably not too important: The PR likely breaks
dead code elimination for exit() and die(). This could be re-added by
checking for the never return type instead.

Checking for a never return type seems more robust if it wasn't already supported by DCE.
I will see if I can do this.

You'd also need to
special-case the lookup of exit/die in namespaced code, where it will
always refer to the global function (as they cannot be declared in
userland).

This is a pre-existing issue with assert() too then, I am happy to fix this at the same time if the RFC passes.

Best regards,

Gina P. Banyard

1 year ago by Ilija Tovilo — view source

unread

I would like to formally propose my idea for exit() as a function brought up to the list on 2024-02-24 [1] with the following RFC:
https://wiki.php.net/rfc/exit-as-function

As mentioned early on in private, I don't see a convincing reason to
remove tokenizer/parser support for exit/die. I'd rather see this
handled in the parser directly, by converting the standalone keywords
to function calls. This avoids any backwards incompatibility, and
avoids special handling in zend_compile_const.

I must be honest, I don't really understand how parsers work, so I went with what I could understand and implement.
But I'm struggling to see a reason for keeping the token in the first place, and what issues hooking into zend_compile_const() has.

Mostly because I think handling exit and die as constants is
misleading. exit; isn't a constant any more than yield; is. Instead,
you could turn exit; into the same AST as exit(); (i.e. a function
call), which will make the compiler handle it automatically. If you
wish, I can have a quick look.

Another thing that's probably not too important: The PR likely breaks
dead code elimination for exit() and die(). This could be re-added by
checking for the never return type instead.

Checking for a never return type seems more robust if it wasn't already supported by DCE.
I will see if I can do this.

That would be great! I agree that this can be done in retrospect.

Ilija

1 year ago by Derick Rethans — view source

unread

On Sun, May 26, 2024 at 11:47 PM Gina P. Banyard internals@gpb.moe
wrote:

On Wednesday, 8 May 2024 at 14:40, Gina P. Banyard
internals@gpb.moe wrote:

I would like to formally propose my idea for exit() as a function
brought up to the list on 2024-02-24 [1] with the following RFC:
https://wiki.php.net/rfc/exit-as-function

As there haven't been any comments for nearly two weeks, I'm
planning on opening the vote for the RFC on Tuesday.

Right now, I am not going to be in favour of this. Specfically because
the Backward Incompatible Changes don't list some of the backward
breaking changes, and no mitigations are provided. See below.

I also don't see the benefit from doing this in the first place.
exit/die has always meant "bail straight out", without any function
semantics (from https://www.php.net/exit):

exit — Output a message and terminate the current script

exit is a language construct and it can be called without 
parentheses if no status is passed.

It's not that this a particularly new or novel behaviour.

I saw somewhere else in the thread that the exit construct could already
throw a catchable exception. I consider that a bug.

As mentioned early on in private, I don't see a convincing reason to
remove tokenizer/parser support for exit/die. I'd rather see this
handled in the parser directly, by converting the standalone keywords
to function calls. This avoids any backwards incompatibility, and
avoids special handling in zend_compile_const.

Another thing that's probably not too important: The PR likely breaks
dead code elimination for exit() and die().

I have just checked this for Xdebug, and you're definitely right with
that. With the removal of the ZEND_EXIT opcode, it can not now detect a
scope/function exit now.

It also breaks my "do tasks on ZEND_EXIT" with the profiler. It is used
to always write the closing profiling footer to the files. Without me
being able to overload thati opcode, data would not be complete.
I even have two bugs (with tests) for this:

Xdebug has been overloading ZEND_EXIT for nearly 20 years now.

This could be re-added by checking for the never return type instead.

I can't quite see a way on how to do that? The EXIT is replaced by an
INIT_FCALL/DO_ICALL:

 4     0  E >   EXT_STMT    
       1        ECHO        '55'
 5     2        EXT_STMT

```
     3      > EXIT        
```
6 4* EXT_STMT
```
     5*       ECHO        '675'
```
7 6* EXT_STMT
```
     7*     > RETURN      null
```

```
     3        INIT_FCALL  'exit'
```
```
     4        DO_ICALL    
```
6 5 EXT_STMT
```
     6        ECHO        '675'
```
7 7 EXT_STMT
```
     8      > RETURN      null
```

The opcodes don't have direct access to the type mask structure as far
as I know.

cheers,
Derick

1 year ago by Gina P. Banyard — view source

unread

On Sun, May 26, 2024 at 11:47 PM Gina P. Banyard internals@gpb.moe
wrote:

On Wednesday, 8 May 2024 at 14:40, Gina P. Banyard
internals@gpb.moe wrote:

I would like to formally propose my idea for exit() as a function
brought up to the list on 2024-02-24 [1] with the following RFC:
https://wiki.php.net/rfc/exit-as-function

As there haven't been any comments for nearly two weeks, I'm
planning on opening the vote for the RFC on Tuesday.

Right now, I am not going to be in favour of this. Specfically because
the Backward Incompatible Changes don't list some of the backward
breaking changes, and no mitigations are provided. See below.

I also don't see the benefit from doing this in the first place.
exit/die has always meant "bail straight out", without any function
semantics (from https://www.php.net/exit):

exit — Output a message and terminate the current script

exit is a language construct and it can be called without
parentheses if no status is passed.

It's not that this a particularly new or novel behaviour.

I saw somewhere else in the thread that the exit construct could already
throw a catchable exception. I consider that a bug.

Considering the whole point of the RFC is to make it a function, I'm not sure why saying the current exit behaviour is not new or novel serves any purpose.
Is catching a TypeError a bug? Is catching a ValueError a bug? Is catching any sort of engine Error a bug?
Because this is effectively what you are saying, and if this is the case, we might as well just reverse the "Exceptions in the engine" RFC from 2014 [1] to make all those Errors fatal errors again which bail out immediately.
This RFC makes exit() a function such that either it exits normally, or a TypeError is thrown if you feed it nonsense, consistently with all our type juggling semantics.

Moreover, when exit() was changed to use an Exception internally there were various people in favour of being able to catch it, and have it run finally blocks (like other programming languages do) [2]
It should also be noted exit() does not just bail out and do nothing else, any destructors will be run before the exiting, so one can still run userland code after it has been called [3]

I have just checked this for Xdebug, and you're definitely right with
that. With the removal of the ZEND_EXIT opcode, it can not now detect a
scope/function exit now.

It also breaks my "do tasks on ZEND_EXIT" with the profiler. It is used
to always write the closing profiling footer to the files. Without me
being able to overload thati opcode, data would not be complete.
I even have two bugs (with tests) for this:

https://bugs.xdebug.org/68

https://bugs.xdebug.org/631

Xdebug has been overloading ZEND_EXIT for nearly 20 years now.

AFAIK Opcodes are not part of any backwards compatibility guarantees.
You could overwrite the function pointer of exit() at MINIT with a custom implementation.
Alternatively, you could use the zend_is_unwind_exit() engine API to check if the exception that has been thrown is the one that corresponds to an exit call, this would work as of PHP 8.0.
Finally, the one that makes more work for me, is to add a new observer API that indicates an exit() and hook into that one.

This could be re-added by checking for the never return type instead.

I can't quite see a way on how to do that? The EXIT is replaced by an
INIT_FCALL/DO_ICALL:

4 0 E > EXT_STMT

1 ECHO '55'
5 2 EXT_STMT

3 > EXIT

6 4* EXT_STMT

5* ECHO '675'

7 6* EXT_STMT

7* > RETURN null

3 INIT_FCALL 'exit'

4 DO_ICALL

6 5 EXT_STMT

6 ECHO '675'

7 7 EXT_STMT

8 > RETURN null

The opcodes don't have direct access to the type mask structure as far
as I know.

You can access the zend_function pointer from the opcode which has access to the type-mask info.
Or, if required, the Zend/Optimizer/zend_func_infos.h file can be amended to include exit() and die() and expose the MAY_BE_NEVER return type.

Best regards,

Gina P. Banyard

[1] https://wiki.php.net/rfc/engine_exceptions_for_php7
[2] https://externals.io/message/107497
[3] https://3v4l.org/gO9IK

1 year ago by Derick Rethans — view source

unread

On Sun, May 26, 2024 at 11:47 PM Gina P. Banyard internals@gpb.moe
wrote:

On Wednesday, 8 May 2024 at 14:40, Gina P. Banyard
internals@gpb.moe wrote:

I would like to formally propose my idea for exit() as a
function brought up to the list on 2024-02-24 [1] with the
following RFC: https://wiki.php.net/rfc/exit-as-function

As there haven't been any comments for nearly two weeks, I'm
planning on opening the vote for the RFC on Tuesday.

Right now, I am not going to be in favour of this. Specfically
because the Backward Incompatible Changes don't list some of the
backward breaking changes, and no mitigations are provided. See
below.

I also don't see the benefit from doing this in the first place.
exit/die has always meant "bail straight out", without any function
semantics (from https://www.php.net/exit):

exit — Output a message and terminate the current script

exit is a language construct and it can be called without
parentheses if no status is passed.

It's not that this a particularly new or novel behaviour.

I saw somewhere else in the thread that the exit construct could
already throw a catchable exception. I consider that a bug.

Considering the whole point of the RFC is to make it a function, I'm
not sure why saying the current exit behaviour is not new or novel
serves any purpose.

I understand that that is the point, but I don't think it's argued why
that needs to happen.

Is catching a TypeError a bug? Is catching a ValueError a bug? Is
catching any sort of engine Error a bug?

exit throwing any Exception is the bug I was referring to.

Because this is effectively what you are saying, and if this is the
case, we might as well just reverse the "Exceptions in the engine" RFC
from 2014 [1] to make all those Errors fatal errors again which bail
out immediately. This RFC makes exit() a function such that either it
exits normally, or a TypeError is thrown if you feed it nonsense,
consistently with all our type juggling semantics.

Moreover, when exit() was changed to use an Exception internally there
were various people in favour of being able to catch it, and have it
run finally blocks (like other programming languages do) [2] It should
also be noted exit() does not just bail out and do nothing else, any
destructors will be run before the exiting, so one can still run
userland code after it has been called [3]

I have just checked this for Xdebug, and you're definitely right
with that. With the removal of the ZEND_EXIT opcode, it can not now
detect a scope/function exit now.

It also breaks my "do tasks on ZEND_EXIT" with the profiler. It is
used to always write the closing profiling footer to the files.
Without me being able to overload thati opcode, data would not be
complete. I even have two bugs (with tests) for this:

https://bugs.xdebug.org/68

https://bugs.xdebug.org/631

Xdebug has been overloading ZEND_EXIT for nearly 20 years now.

AFAIK Opcodes are not part of any backwards compatibility guarantees.

AFAIK it isn't documented otherwise either. Just because it's not
documented, doesn't mean we should remove long term existing
functionality.

You could overwrite the function pointer of exit() at MINIT with a
custom implementation.

I can, but overriding functions for this sort of thing is a little bit
of a hack, and I prefer having an actual API. It is also a runtime
feature, not a compile time (see below again).

Alternatively, you could use the zend_is_unwind_exit() engine API to
check if the exception that has been thrown is the one that
corresponds to an exit call, this would work as of PHP 8.0.

Finally, the one that makes more work for me, is to add a new observer
API that indicates an exit() and hook into that one.

That seems a fairly sensible thing to add. But that only addresses my
profiler use case, and not the branch/path detection use case.

This could be re-added by checking for the never return type instead.

I can't quite see a way on how to do that? The EXIT is replaced by an
INIT_FCALL/DO_ICALL:
 4     0  E >   EXT_STMT
       1        ECHO           '55'
 5     2        EXT_STMT
     3      > EXIT
6 4* EXT_STMT
     5*       ECHO           '675'
7 6* EXT_STMT
     7*     > RETURN         null
     3        INIT_FCALL     'exit'
     4        DO_ICALL
6 5 EXT_STMT
     6        ECHO           '675'
7 7 EXT_STMT
     8      > RETURN         null
The opcodes don't have direct access to the type mask structure as
far as I know.
You can access the zend_function pointer from the opcode which has
access to the type-mask info.

Isn't that only during runtime, and not compile time?

The branch/path detection works with the op_array, just after it has
been created through compile_file. I don't see any structure in the
DO_ICALL opcode that has access to this never flag.

Or, if required, the Zend/Optimizer/zend_func_infos.h file can be
amended to include exit() and die() and expose the MAY_BE_NEVER return
type.

I don't think that helps either, as that is also run time?

cheers,
Derick

1 year ago by tim@bastelstu.be — view source

unread

Hi

I have just checked this for Xdebug, and you're definitely right with
that. With the removal of the ZEND_EXIT opcode, it can not now detect a
scope/function exit now.

Can you clarify why ZEND_EXIT is special for Xdebug when compared to any
other function that unconditionally throws or unconditionally calls
exit(); by itself, i.e. any other function with a never return type?

It also breaks my "do tasks on ZEND_EXIT" with the profiler. It is used
to always write the closing profiling footer to the files. Without me
being able to overload thati opcode, data would not be complete.
I even have two bugs (with tests) for this:

https://bugs.xdebug.org/68

https://bugs.xdebug.org/631

Likewise, how is ZEND_EXIT special here? How does it work differently
than a script that runs to completion without calling exit(); or a
script that fails, e.g. due to an uncaught exception or due to reaching
the memory limit?

Best regards
Tim Düsterhus

1 year ago by Derick Rethans — view source

unread

I have just checked this for Xdebug, and you're definitely right
with that. With the removal of the ZEND_EXIT opcode, it can not now
detect a scope/function exit now.

Can you clarify why ZEND_EXIT is special for Xdebug when compared to
any other function that unconditionally throws or unconditionally
calls exit(); by itself, i.e. any other function with a never return
type?

For each op_array, Xdebug tries to figure out every possible path by
following jumps. Certain opcodes, such as GENERATOR_RETURN, THROW,
RETURN, and EXIT [1] are considered as an exit point out of the
function. A path ends there.

If there is a function call to a function with a 'never' return type,
then that function will potentially throw, or exit.

But that's not relevant for the analysis, as these userland functions
will have their own oparrays with their entry and exit points.

I can't through this mechanism know when an internal function does
something similar, as it looks like just a normal function call (unless
I hard code the check for a call to 'exit').

[1] https://github.com/xdebug/xdebug/blob/master/src/coverage/code_coverage.c#L348

It also breaks my "do tasks on ZEND_EXIT" with the profiler. It is
used to always write the closing profiling footer to the files.
Without me being able to overload thati opcode, data would not be
complete. I even have two bugs (with tests) for this:

https://bugs.xdebug.org/68

https://bugs.xdebug.org/631

Likewise, how is ZEND_EXIT special here? How does it work differently than a
script that runs to completion without calling exit(); or a script that fails,
e.g. due to an uncaught exception or due to reaching the memory limit?

I overload EXIT so that I can flush the profile file before the script
actually fully ends. This is useful for testing through phpt tests. It
looks like I might be able to use existing function observers for this,
but I haven't fully made that work yet.

cheers,
Derick

--
https://derickrethans.nl | https://xdebug.org | https://dram.io

Author of Xdebug. Like it? Consider supporting me: https://xdebug.org/support

mastodon: @derickr@phpc.social @xdebug@phpc.social

1 year ago by tim@bastelstu.be — view source

unread

Hi

For each op_array, Xdebug tries to figure out every possible path by
following jumps. Certain opcodes, such as GENERATOR_RETURN, THROW,
RETURN, and EXIT [1] are considered as an exit point out of the
function. A path ends there.

You linked to a code_coverage.c. Is this information just used for code
coverage analysis to indicate that unreachable code is unreachable and
thus uncoverable or is there more to this?

If there is a function call to a function with a 'never' return type,
then that function will potentially throw, or exit.

But that's not relevant for the analysis, as these userland functions
will have their own oparrays with their entry and exit points.

I can't through this mechanism know when an internal function does
something similar, as it looks like just a normal function call (unless
I hard code the check for a call to 'exit').

The compiler has the function table available. It is used to optimize
specific functions into dedicated Opcodes. Thus you should be able to
look up the function within the function table and then check its return
type.

[1] https://github.com/xdebug/xdebug/blob/master/src/coverage/code_coverage.c#L348

It also breaks my "do tasks on ZEND_EXIT" with the profiler. It is
used to always write the closing profiling footer to the files.
Without me being able to overload thati opcode, data would not be
complete. I even have two bugs (with tests) for this:

https://bugs.xdebug.org/68

https://bugs.xdebug.org/631

Likewise, how is ZEND_EXIT special here? How does it work differently than a
script that runs to completion without calling exit(); or a script that fails,
e.g. due to an uncaught exception or due to reaching the memory limit?

I overload EXIT so that I can flush the profile file before the script
actually fully ends. This is useful for testing through phpt tests. It
looks like I might be able to use existing function observers for this,
but I haven't fully made that work yet.

Why does the EXIT opcode need to be special-cased here? What if a script
dies due to an uncaught Exception, due to exceeding the memory limit, or
simply terminates by successfully running until the finish, without
explicitly calling exit()?

Best regards
Tim Düsterhus

1 year ago by Derick Rethans — view source

unread

For each op_array, Xdebug tries to figure out every possible path by
following jumps. Certain opcodes, such as GENERATOR_RETURN, THROW,
RETURN, and EXIT [1] are considered as an exit point out of the
function. A path ends there.

You linked to a code_coverage.c. Is this information just used for
code coverage analysis to indicate that unreachable code is
unreachable and thus uncoverable or is there more to this?

It's for that, but also for path/branch analysis, as I just wrote above.

Having an EXIT opcode, instead of a function call is a clean indicator
that a path (and branch) ends here.

With a function call, I have no idea about whether a path ends there.

If there is a function call to a function with a 'never' return
type, then that function will potentially throw, or exit.

But that's not relevant for the analysis, as these userland
functions will have their own oparrays with their entry and exit
points.

The compiler has the function table available. It is used to optimize
specific functions into dedicated Opcodes. Thus you should be able to
look up the function within the function table and then check its
return type.

Yes, but functions that call exit are not required to have the 'never'
return type, so that's not useful.

It also breaks my "do tasks on ZEND_EXIT" with the profiler. It
is used to always write the closing profiling footer to the
files. Without me being able to overload thati opcode, data
would not be complete. I even have two bugs (with tests) for
this:

https://bugs.xdebug.org/68 - https://bugs.xdebug.org/631

Likewise, how is ZEND_EXIT special here? How does it work
differently than a script that runs to completion without calling
exit(); or a script that fails, e.g. due to an uncaught exception
or due to reaching the memory limit?

I overload EXIT so that I can flush the profile file before the
script actually fully ends. This is useful for testing through phpt
tests. It looks like I might be able to use existing function
observers for this, but I haven't fully made that work yet.

Why does the EXIT opcode need to be special-cased here? What if a
script dies due to an uncaught Exception, due to exceeding the memory
limit, or simply terminates by successfully running until the finish,
without explicitly calling exit()?

It's not important for normal functionality as the execution still ends
there. It's useful for testing as I said before.

Best regards
Tim Düsterhus

--
https://derickrethans.nl | https://xdebug.org | https://dram.io

Author of Xdebug. Like it? Consider supporting me: https://xdebug.org/support

mastodon: @derickr@phpc.social @xdebug@phpc.social

1 year ago by tim@bastelstu.be — view source

unread

Hi

It's for that, but also for path/branch analysis, as I just wrote above.

Having an EXIT opcode, instead of a function call is a clean indicator
that a path (and branch) ends here.

With a function call, I have no idea about whether a path ends there.

If the function called the return type never then you know that the
path ends there. As the exit() function cannot be overridden, you are
even guaranteed that any call to a function called exit() is the real
exit() function. Either the script exits or the function throws a
TypeError. The latter is true even with current PHP versions as
demonstrated by this example script:

 &lt;?php

 namespace Foo;

 function a() {
 	exit(new \stdClass());
 }

 try {
 	a();
 } catch (\Throwable) {}
 echo "executed", PHP_EOL;

If there is a function call to a function with a 'never' return
type, then that function will potentially throw, or exit.

But that's not relevant for the analysis, as these userland
functions will have their own oparrays with their entry and exit
points.

The compiler has the function table available. It is used to optimize
specific functions into dedicated Opcodes. Thus you should be able to
look up the function within the function table and then check its
return type.

Yes, but functions that call exit are not required to have the 'never'
return type, so that's not useful.

The exit() function as implemented in Gina's PR has the never return
type, thus the situation is no worse than the current situation. Instead
of checking for the ZEND_EXIT opcode, you check for a function call and
if the function has the never return type (which exit() is
guaranteed to have), then you can treat it as a path ending there. Or
you can just hardcode a check for a call to the exit() function, as
outlined above.

It also breaks my "do tasks on ZEND_EXIT" with the profiler. It
is used to always write the closing profiling footer to the
files. Without me being able to overload thati opcode, data
would not be complete. I even have two bugs (with tests) for
this:

https://bugs.xdebug.org/68 - https://bugs.xdebug.org/631

Likewise, how is ZEND_EXIT special here? How does it work
differently than a script that runs to completion without calling
exit(); or a script that fails, e.g. due to an uncaught exception
or due to reaching the memory limit?

I overload EXIT so that I can flush the profile file before the
script actually fully ends. This is useful for testing through phpt
tests. It looks like I might be able to use existing function
observers for this, but I haven't fully made that work yet.

Why does the EXIT opcode need to be special-cased here? What if a
script dies due to an uncaught Exception, due to exceeding the memory
limit, or simply terminates by successfully running until the finish,
without explicitly calling exit()?

It's not important for normal functionality as the execution still ends
there. It's useful for testing as I said before.

It was not clear to me that this is only relevant for testing purposes.
I don't understand why you would need the special handling for testing
purposes, but as outlined above you can just hardcode a check for a call
to the exit() function if the tests cannot use the regular script end
for some reason.

Best regards
Tim Düsterhus

1 year ago by Rob Landers — view source

unread

If there is a function call to a function with a 'never' return
type, then that function will potentially throw, or exit.

But that's not relevant for the analysis, as these userland
functions will have their own oparrays with their entry and exit
points.

The compiler has the function table available. It is used to optimize
specific functions into dedicated Opcodes. Thus you should be able to
look up the function within the function table and then check its
return type.

Yes, but functions that call exit are not required to have the 'never'
return type, so that's not useful.

The exit() function as implemented in Gina's PR has the never return
type, thus the situation is no worse than the current situation. Instead
of checking for the ZEND_EXIT opcode, you check for a function call and
if the function has the never return type (which exit() is
guaranteed to have), then you can treat it as a path ending there. Or
you can just hardcode a check for a call to the exit() function, as
outlined above.

Just to clarify: a never return type doesn't mean execution ends there.

throwing:
https://3v4l.org/s3KDM

infinite loop:
https://3v4l.org/Lpbtt

— Rob

1 year ago by Claude Pache — view source

unread

Le 8 mai 2024 à 15:40, Gina P. Banyard internals@gpb.moe a écrit :

Hello Internals,

I would like to formally propose my idea for exit() as a function brought up to the list on 2024-02-24 [1] with the following RFC:
https://wiki.php.net/rfc/exit-as-function

There have been some slight tweaks to the implementation, namely that the transformation from a "constant" to a function is done at compile time and we do not hook into the behaviour of constants any longer.

Let me know what you think.

Best regards,

Gina P. Banyard

[1] https://externals.io/message/122483

Hi Gina,

It is ok, for exit to be wired to a function. However, the paren-less exit syntax is absolutely reasonable and should be kept in the long term (more on this below). It is true that your proposal doesn’t remove the paren-less form, but the implementation (a hack around constant evaluation) strongly suggests that it is desirable to deprecate it in the future (hey, it is even the very first item of the Future scope section). Therefore, I think it is preferable to keep proper parsing rules for exit.

Now here is why I think that exit syntax without parentheses is reasonable. For me, exit is a control-flow instruction, saying to terminate the program, just like return terminates a function or break terminates a loop. I don’t care about implementation details, whether it is implemented as invoking a never-returning function, or as throwing a hidden exception, or whatever. The fact is, in typical use:

header("Location: /somewhere/else.php");
exit;

I would consider an oddity and useless noise to put empty parentheses after a bare exit, just as I wouldn’t consider to put empty parentheses after a bare return.

(BTW... I wouldn’t describe a bare yield (equivalent to yield null) as “constant” evaluation, even if it may be found in same positions as a “real” constant.)

—Claude

1 year ago by Gina P. Banyard — view source

unread

Hi Gina,

It is ok, for exit to be wired to a function. However, the paren-less exit syntax is absolutely reasonable and should be kept in the long term (more on this below). It is true that your proposal doesn’t remove the paren-less form, but the implementation (a hack around constant evaluation) strongly suggests that it is desirable to deprecate it in the future (hey, it is even the very first item of the Future scope section). Therefore, I think it is preferable to keep proper parsing rules for exit.

Now here is why I think that exit syntax without parentheses is reasonable. For me, exit is a control-flow instruction, saying to terminate the program, just like return terminates a function or break terminates a loop. I don’t care about implementation details, whether it is implemented as invoking a never-returning function, or as throwing a hidden exception, or whatever. The fact is, in typical use:

php header("Location: /somewhere/else.php"); exit;

I would consider an oddity and useless noise to put empty parentheses after a bare exit, just as I wouldn’t consider to put empty parentheses after a bare return.

(BTW... I wouldn’t describe a bare yield (equivalent to yield null) as “constant” evaluation, even if it may be found in same positions as a “real” constant.)

—Claude

Adding something to Future Scope does not mean I will pursue this, this is just a possible direction that can be done and needs to be argued on its own merits.
And frankly I have no intension on pushing this, I would rather remove support for passing strings to exit() than removing the paren-less form.

It should be noted that in other programming languages, exit() is a function that requires an argument for the status code.
So I don't think of it as a control-flow instruction (because I don't see terminating something as control flow) but if that's how you see it fine.
If that's the only reason for keeping proper parsing rules, I'm not sure whether that's worthwhile IMHO.

Best regards,

Gina P. Banyard

1 year ago by Daikaras — view source

unread

Hello Internals,

I would like to formally propose my idea for exit() as a function brought up to the list on 2024-02-24 [1] with the following RFC:
https://wiki.php.net/rfc/exit-as-function

There have been some slight tweaks to the implementation, namely that the transformation from a "constant" to a function is done at compile time and we do not hook into the behaviour of constants any longer.

Let me know what you think.

Best regards,

Gina P. Banyard

[1] https://externals.io/message/122483
I don't quite get the reasoning behind this change. exit behavior is
not consistent with functions because it is not a function. But it is
consistent with other constructs like echo. Or am I wrong? I'm all for
stricter typing but not at a cost of reliability. exit IMO should
always terminate script, no exceptions (no pun intended :)). Otherwise,
it may have security implications in some codebases, as Saki pointed
out. So I'd rather go the other way and have it error fatally like for
example break does when passed non-integers. And on the same note
maybe syntax could be improved to not require parentheses when passing
the code (so exit 0; would be valid code) since this seems to be the
cause of the confusion.

Just my 2c.

p.s. In B/C table you should mention that passing Stringable objects
will throw TypeError when strict_types is enabled.

1 year ago by Gina P. Banyard — view source

unread

Hello Internals,

I would like to formally propose my idea for exit() as a function brought up to the list on 2024-02-24 [1] with the following RFC:
https://wiki.php.net/rfc/exit-as-function

There have been some slight tweaks to the implementation, namely that the transformation from a "constant" to a function is done at compile time and we do not hook into the behaviour of constants any longer.

Let me know what you think.

Best regards,

Gina P. Banyard

[1] https://externals.io/message/122483

I have updated the RFC and implementation to now keep the T_EXIT token after request.
https://wiki.php.net/rfc/exit-as-function

I will open the voting tomorrow so that the RFC can be accepted for 8.4.

Best regards,

Gina P. Banyard