Reviving scalar type hints

10 years ago by francois@php.net

Hi,

De : Arvids Godjuks [mailto:arvids.godjuks@gmail.com]

The 0.1 RFC version was mentioned a lot as a good compromise by many people
and had major support.
Maybe someone competent could pick it up, make the necessary adjustments
that were required and let people vote on it? Start with small steps - get the
weak type hints into the language first, see how it gets used and then we
can always add strict type hints if there is a need/desire to do that.

That way we finally get type hints into the language, and those wanting the
strict variety have all the opportunities in the world to add them at a
later release with proper discussion and development time.

That's what I am planning. If I write an RFC, it will be based on Andrea's 0.1/0.2 version, and won't propose different modes.

The problem is that the previous controversial RFC focused people on weak vs strict typing, while we should have explored other technical concerns. Here are the main ones I see :

The fact that the RFC supports single types only, like the previous 'return type' RFC. While this is easier to implement, it raises several issues, as multiply-typed arguments are an integral part of the PHP language (mostly completeness and compatibility with internal function hinting). If we want to support multiple types the same way for internal and userspace functions, we must extend the ZPP layer to support it.

The mechanism to check for type hints on internal functions, while easy to implement, is not sufficient, as a lot of internal functions get a bare zval from the parsing system and then convert it by themselves. With the proposed mechanism, no hinting is possible on such arguments, which will make the implementation differ from the documentation. Even if the check is done in the function body, it won't be done consistently with type hinting checks and won't raise a similar error. As most cases are related to multiply-typed args, the solution is to add multiply-typed support to ZPP. Multiply-typed support requires redefining the scalar conversion rules, to take care of the target type being a combination of single types.

We need to define the appropriate extension to Reflection parameters/return type. That's not complex, but it takes time.

Other changes I'd like to propose are described in Bob Weinand's article, at https://gist.github.com/bwoebi/b4c5564388ecd004ba96. The article explains how restricting weak conversion possibilities would make strict typing almost useless. Changes include forbidding bool to int/float or '7years' to int. This cannot be left for future additions as BC breaks will make it impossible. To remain consistent between userspace/internal functions, this must also be done at the ZPP level.

Using bare class names as type hints is a potential issue too, as it makes reserved keywords and class names share the same naming space. I think we should deprecate the use of class names as type hints in favor of 'object(class-name)'. If we don't do that, every future addition of a type hint keyword will cause a BC break (and will be practically impossible).

Additional 'hybrid' types like 'numeric' and 'mixed' should also be provided.
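
As a rough illustration of the restricted weak conversions mentioned above (forbidding bool to int and '7years' to int), here is a minimal userspace sketch. It is not taken from Bob Weinand's article or from any RFC text: the function name weak_to_int and the exact rule set are assumptions made for this example, and a real change would live in ZPP rather than in PHP code.

```php
<?php
// Hypothetical helper: weak conversion to int under the restrictions
// discussed above (no bool, no partially numeric strings, no lossy floats).
function weak_to_int($value): int
{
    if (is_int($value)) {
        return $value;
    }
    // Floats are accepted only when they have no fractional part and
    // fit in the platform integer range.
    if (is_float($value) && is_finite($value) && floor($value) === $value
        && $value >= PHP_INT_MIN && $value < PHP_INT_MAX) {
        return (int) $value;
    }
    // Strings must be entirely numeric: '7' passes, '7years' does not.
    if (is_string($value) && is_numeric($value)) {
        return weak_to_int($value + 0);
    }
    // Everything else, including true/false and null, is rejected.
    throw new InvalidArgumentException('Cannot convert ' . gettype($value) . ' to int');
}
```

With these assumed rules, weak_to_int('7') and weak_to_int(7.0) are accepted, while weak_to_int('7years'), weak_to_int(true) and weak_to_int(7.5) are rejected.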

So, most features I have in mind are really 'now or never'.

My main concern, anyway, is with the feature freeze announced for March 15. If we need a vote by this date, it's impossible. And planning such changes for 7.1 is probably unrealistic because of the huge syntax additions and BC breaks they bring. So, if it's too late for inclusion in 7.0, I think I'll give up.

So, could someone confirm what 'feature freeze' exactly means ?

Regards

François

The main issues are completeness (we can give hints for some cases, but not for others) and, more importantly, compatibility with internal functions. As Andrea herself agreed, her mechanism for type hinting on internal functions is not sufficient. Just using the ZPP macros, as they exist today, won't work as a lot of internal functions get a bare zval and then convert it by themselves. So, in this case, we would check nothing. An argument described as 'string|array' in the documentation wouldn't produce the same sort of error, when sent an object, as its counterpart described as 'string'. This is not consistent and will cause a lot of side effects if it is left out of the type hint layer.

10 years ago by Peter Cowburn

[snip]
So, could someone confirm what 'feature freeze' exactly means ?

Note that the accepted timeline for PHP 7 [1] states that we will have
three months to "Finalize implementation & testing of new features", after
the March deadline. That should give us some time to tie down the actual
implementation side of things. Apart from that, it's more a matter of
getting an RFC drafted, discussed, voted on, before the March deadline.

I, for one, wouldn't mind the vote taking a wee while longer (say, a week
or two) if it means we can get this feature added.


10 years ago by Arvids Godjuks

2015-02-16 18:42 GMT+02:00 François Laupretre francois@php.net:

[snip]

Sounds quite reasonable and level-headed. I'm going to contact you off-list to
assist in any way I can. I'll be damned if the RFC fails a 4th time...

10 years ago by Dmitry Stogov

On Mon, Feb 16, 2015 at 7:42 PM, François Laupretre francois@php.net
wrote:

Hi,

De : Arvids Godjuks [mailto:arvids.godjuks@gmail.com]

The 0.1 RFC version was mentioned a lot as a good compromise by many
people
and had major support.
Maybe someone competent could pick it up, make necessary adjustments
that
where required and let people vote on it? Start with small steps - get
the
weak type hints into the language first, see how it gets used and then we
can always add strict type hints if there is a need/desire to do that.

That way we finally get type hints into the language, and those wanting
the
strict variety have all the opportunities in the world to add them at a
later release with proper discussion and development time.

That's what I am planning. If I write an RFC, it will be based on Andrea's
0.1/0.2 version, and won't propose different modes.

I would propose exactly Andrea's 0.1.
Most people agreed to support weak type hints by default.
This proposal won't prevent the future addition of optional strict type hints.
Everyone is tired of the endless arguing.

The problem is that the previous controversial RFC focused people on weak
vs strict typing, while we should have explored other technical concerns.
Here are the main ones I see :

the fact that the RFC supports single types only, like the previous
'return type' RFC. While it is easier to implement, it opens several issues
as multiply-typed arguments are an integral part of the PHP language
(mostly completeness and compatibility with internal function hinting). If
we want to support multiple types the same way for internal and userspace
functions, we must extend the ZPP layer to support it.

This is not a big technical problem.

the mechanism to check for type hints on internal functions, while easy
to implement, is not sufficient, as a lot of internal functions get a bare
zval from the parsing system and then convert it by themselves. With the
proposed mechanism, there's no possible hinting on such argument, which
will make the implementation different from the documentation. Even if the
check is done by the function body, it won't be done in a consistent way
with type hinting checks and won't raise a similar error. As most cases are
related to multiply-typed args, the solution is in adding multiply-typed
support to ZPP. Multiply-typed support needs to redefine scalar conversion
rules, to take care of the target type being a combination of single types.

I wouldn't raise this question. Let it work as is for now.

We need to define the appropriate extension to Reflection
parameters/return type. That's not complex, but it takes time.

It's a subject for a separate, more or less obvious, RFC.

Other changes I'd like to propose are exposed in Bob Weinand's article,
at https://gist.github.com/bwoebi/b4c5564388ecd004ba96. The article
explains how restricting weak conversion possibilities would make strict
typing almost useless. Changes include forbidding bool to int/float or
'7years' to int. This cannot be left for future additions as BC breaks will
make it impossible. To remain consistent between userspace/internal
functions, this must also be done at the ZPP level.

Let's go forward in small steps.
If we had done that before, we would already have all the features.
We may restrict ZPP rules in the next RFC. I'll most probably support it.
The same goes for the rest.

Thanks. Dmitry.


10 years ago by francois@php.net

Hi Dmitry

De : Dmitry Stogov [mailto:dmitry@zend.com]

I would propose exactly Andrea's 0.1.
Most people agreed to support weak type hints by default.
This proposal won't prevent the future addition of optional strict type hints.
Everyone is tired of the endless arguing.

Yes, but that's not exactly what I had in mind. I thought we could just use it as a base, and add features that, for any reason, will be much harder to implement in the future.

However, most of the work I want to do deals with ZPP macros, which are not so complex. And userspace type hints follow ZPP rules, so ZPP changes will apply to internal and userspace functions automatically. The first thing to do is to restrict the authorized parsing conversions (ZPP and zend_parse_parameters must match here) to get approval from strict-typing fans (the tan(1) and curl use cases must be solved first). Then, define multi-type conversion rules and implement them in ZPP (just ZPP, not zend_parse_parameters). Once this is available in ZPP, internal functions will migrate from Z_PARAM_ZVAL at their own pace, but the mechanism must exist before functions can gradually use it.

I would really like to implement multi-type support now. IMO, it is one of the most useful characteristics I expect from type hinting. Multi-typed arguments have been natural in PHP from the beginning and should have been part of the RFC from its beginning too. If people start describing multi-type args as 'mixed', which is not the same information, they won't come back and do the work again with more precise types, asking whether each 'mixed' is a 'real' mixed or just a placeholder waiting for us to implement the rest. I am especially concerned with the fact that a good part of userspace function arguments will be just impossible to type, except as 'mixed', which is not accurate in most cases. Three lines above the argument declaration, a phpdoc comment says that $arg is 'string|array', but the user cannot transmit the information to the engine... I know it may be a question of time but it saddens me.
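
To make the 'string|array' case concrete, here is a minimal userspace sketch of what multi-type matching could look like. The conversion rules are exactly the part that still has to be defined, so the ones below are assumptions chosen for the example (an exact type match wins, a fully numeric string may become int or float, an int may widen to float), and the function name match_multi_type is hypothetical. The real mechanism would be implemented in ZPP, not in PHP code.

```php
<?php
// Hypothetical resolver for a multi-type hint such as 'int|float' or
// 'string|array'. The rules are assumed for illustration only.
function match_multi_type($value, string $hint)
{
    $allowed = explode('|', $hint);
    // gettype() uses legacy names; map them to the hint keywords.
    $names = ['integer' => 'int', 'double' => 'float', 'boolean' => 'bool'];
    $actual = gettype($value);
    $actual = $names[$actual] ?? $actual;

    // 1. No conversion when the value already has one of the listed types.
    if (in_array($actual, $allowed, true)) {
        return $value;
    }
    // 2. Weak conversion of fully numeric strings to int or float.
    if (is_string($value) && is_numeric($value)) {
        $number = $value + 0; // int or float, depending on the string
        $type = is_int($number) ? 'int' : 'float';
        if (in_array($type, $allowed, true)) {
            return $number;
        }
        if (is_int($number) && in_array('float', $allowed, true)) {
            return (float) $number;
        }
    }
    // 3. An int widens to float when float is allowed but int is not.
    if (is_int($value) && in_array('float', $allowed, true)) {
        return (float) $value;
    }
    throw new TypeError("Value of type $actual does not match '$hint'");
}
```

Under these assumptions an object passed for 'string|array' fails in the same way as an object passed for 'string', which is the consistency argued for above.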

I propose you implement the parser/userspace part (I don't know what remains to be done) and I write and implement ZPP restrictions and multi-type support. I'll try to define ZPP restrictions (single type) tonight.

I'll also read the 0.1 RFC again and send comments if I have any. I already know that I want to add the 'resource' type because Andrea's argument against it does not hold, IMO (and omitting it would make it the only type without a hint).

We need to define the appropriate extension to Reflection parameters/return type. That's not complex, but it takes time.
It's a subject for a separate, more or less obvious, RFC.

Would feature freeze also apply to such work ?

Before I write anything, what do you think of suppressing conversions to/from IS_NULL in ZPP (provided we support multi-typed args, of course)?

Regards

François

10 years ago by Andrey Andreev

Hi,

I would propose exactly Andrea's 0.1.
Most people agreed to support weak type hints by default.
This proposal won't prevent the future addition of optional strict type hints.

Sorry, but I'll have to repeat what has been said over and over again:
the 0.1 version did not have overwhelming support and quite a lot of
people aren't OK with (only) weak type hints.

Part of the rationale behind the 0.2+ versions was that it would be
unfair to strict typing supporters if weak is accepted alone. I agree
with that 100% (although I didn't agree with the proposed solution)
and it's not hard to imagine why ...

Also, IMO the only way a weak-types addition will NOT prevent strict
type-hints in the future is to use the foo((type) $bar) syntax that
was proposed by Anthony Ferrara 2 years ago. The syntax itself being
different from that of already existing class type-hints implies both
that it's a weak hint and that strict typing is expected to be
available in the future (although I see no reason to delay it). Short of
weird strict/weak modes or a regexp-like syntax, there is no other way
to add both features.

Everyone is tired of the endless arguing.

That's not a reason to force in a feature before it's clear that it is
what we all (or at least most) really want. I hated seeing so many
people supporting a proposal that they admit to not liking, just so we
have something at all - I don't believe that's the right approach,
especially coming from tech people.

Cheers,
Andrey.

10 years ago by Pierre Joye

On Mon, Feb 16, 2015 at 7:42 PM, François Laupretre francois@php.net
wrote:

[snip]

I would propose exactly Andrea's 0.1.
Most people agreed to support weak type hints by default.

So a 50/50 vote is most people in favor of one type? Sorry, this is not the
case.

This proposal won't prevent the future addition of optional strict type
hints.
Everyone is tired of the endless arguing.

Yes, we are. The only difference is that one camp made compromises and tried to
find solutions. Now, in a déjà vu scenario, we are heading exactly toward what
you want because we do not have any other solution. A single choice by default
is not good. This is a very bad move.

10 years ago by pajousek@gmail.com

Hi,

the fact that the RFC supports single types only, like the previous 'return type' RFC. While it is easier to implement, it opens several issues as multiply-typed arguments are an integral part of the PHP language (mostly completeness and compatibility with internal function hinting). If we want to support multiple types the same way for internal and userspace functions, we must extend the ZPP layer to support it.

the mechanism to check for type hints on internal functions, while easy to implement, is not sufficient, as a lot of internal functions get a bare zval from the parsing system and then convert it by themselves. With the proposed mechanism, there's no possible hinting on such argument, which will make the implementation different from the documentation. Even if the check is done by the function body, it won't be done in a consistent way with type hinting checks and won't raise a similar error. As most cases are related to multiply-typed args, the solution is in adding multiply-typed support to ZPP. Multiply-typed support needs to redefine scalar conversion rules, to take care of the target type being a combination of single types.

Hello,

I know this is probably a pretty unpopular opinion in PHP (based on
the replies I got in the other thread), but different values for
parameters should be IMHO solved by method overloading and such.

Regards
Pavel Kouril

10 years ago by francois@php.net

De : Pavel Kouril [mailto:pajousek@gmail.com]

Hello,

I know this is probably a pretty unpopular opinion in PHP (based on
the replies I got in the other thread), but different values for
parameters should be IMHO solved by method overloading and such.

The question is not that it's unpopular, it's that 1. It requires strict types, and 2. If we want to solve this by polymorphism, we must also support polymorphism on functions. Now, tell me how you would solve the tan(int|float) case with polymorphism ? One 'tan' function for int, one for float ? str_replace(string|array, string|array, string|array, int &) is also a nice case to study if you can't sleep...

Sorry but polymorphism on scalar types is possible with strict types only, which is out of scope for the next RFC.

François

10 years ago by pajousek@gmail.com

De : Pavel Kouril [mailto:pajousek@gmail.com]

Hello,

I know this is probably a pretty unpopular opinion in PHP (based on
the replies I got in the other thread), but different values for
parameters should be IMHO solved by method overloading and such.

The question is not that it's unpopular, it's that 1. It requires strict types, and 2. If we want to solve this by polymorphism, we must also support polymorphism on functions. Now, tell me how you would solve the tan(int|float) case with polymorphism ? One 'tan' function for int, one for float ? str_replace(string|array, string|array, string|array, int &) is also a nice case to study if you can't sleep...

Sorry but polymorphism on scalar types is possible with strict types only, which is out of scope for the next RFC.

François

I will get to this more tomorrow, but for now just a few short notes:

I'd only have tan(float) with int being able to be passed and the
value would be implicitly converted to float. This is something even
strongly typed languages normally do, and there is no reason PHP
shouldn't be able to. If somebody declared both tan(float) and tan(int),
then when calling tan($x), the appropriate one based on the
current type of $x should be called. I don't see any reason why this
can be done only in strongly typed languages; could you name the exact
reasons?
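
The dispatch described above can be sketched in userspace. PHP does not allow two functions with the same name, so this example emulates the "overloads" with a table of closures keyed by argument type. The helper name dispatch_by_type and the widening rule are assumptions made for illustration, not an existing or proposed engine feature.

```php
<?php
// Hypothetical emulation of overload selection by runtime type.
function dispatch_by_type(array $overloads, $arg)
{
    $type = is_int($arg) ? 'int' : (is_float($arg) ? 'float' : gettype($arg));
    // An exact match on the current type of the argument wins.
    if (isset($overloads[$type])) {
        return $overloads[$type]($arg);
    }
    // Only a float version exists: widen int to float implicitly.
    if ($type === 'int' && isset($overloads['float'])) {
        return $overloads['float']((float) $arg);
    }
    throw new TypeError("No overload accepts $type");
}

// Two declarations, selected by the current type of the argument.
$describe = [
    'int'   => function (int $x) { return "int version: $x"; },
    'float' => function (float $x) { return "float version: $x"; },
];
// A single tan(float) declaration that also accepts ints.
$tanOnly = ['float' => function (float $x) { return tan($x); }];

echo dispatch_by_type($describe, 2), "\n";   // picks the int version
echo dispatch_by_type($describe, 2.5), "\n"; // picks the float version
echo dispatch_by_type($tanOnly, 0), "\n";    // int 0 is widened to float
```

Note that the selection happens at call time on the runtime type, which is the behaviour described above. Whether that is acceptable under weak typing is the point under discussion.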

The str_replace(string|array $search, string|array $replace,
string|array $subject[, int &$count ]) is a harder one to comment on
though. I will refrain from answering on that one right now and will
get back to that one tomorrow, because that one needs a little bit
more thinking.

Regards
Pavel Kouril

10 years ago by pajousek@gmail.com

[snip]

Hello,

finally got to this, after a while of thinking - the first part was
answered before, so I'll get only to the str_replace now. :)

Basically, each of the 8 different versions, based on the parameter types,
does a different thing, if I'm counting correctly. So having 8 different
declarations is IMHO valid. I know it sounds a little bit weird, but
if you really consider the number of things the function does based
on the passed parameters, why would having one declaration with
multiple argument types and if statements deciding what to do based on
the argument types be better than 8 different declarations, where each
one does one predictable thing and can be documented separately? :)
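
For reference, the count of 8 follows from two possible types (string or array) for each of the three value parameters. The short sketch below only enumerates the combinations as signature strings. It says nothing about whether each combination is useful, and the by-reference $count parameter is left out.

```php
<?php
// Enumerate every string/array combination for the three value
// parameters of str_replace().
$types = ['string', 'array'];
$signatures = [];
foreach ($types as $search) {
    foreach ($types as $replace) {
        foreach ($types as $subject) {
            $signatures[] = "str_replace($search \$search, $replace \$replace, $subject \$subject)";
        }
    }
}
// Print the number of declarations followed by one signature per line.
echo count($signatures), " declarations:\n", implode("\n", $signatures), "\n";
```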

PS: I know there's something like a declined RFC for method
overloading in the wiki (actually, just a 7-year-old email), but if
there were someone interested in having method
overloading in PHP who understands PHP's internals (I don't) and would
be able to make the implementation, I'd like to discuss this idea with
them more and contribute somehow to a possible RFC.

Regards
Pavel Kouril

10 years ago by jgmdev@gmail.com

2015-02-21 17:26 GMT-04:00 Pavel Kouřil pajousek@gmail.com:

[snip]

I recently discovered this compact and lean implementation of PHP (PH7),
which has overloading, scalar hinting and other nice features, so besides
HHVM it could serve as an inspiration for those working on RFCs:

github.com/symisc/PH7

It would be nice to have function overloading in PHP now that there is going
to be scalar type hinting (hopefully at some point); it is the best and a proven
approach to dealing with different parameter types in a clean and
straightforward way. Some functionality at the PHP C level (when
developing extensions) to register various overloads would also be nice. When
working on the wxWidgets wrapper for PHP I had to mix all function/method
overloads into a single function, which isn't easy to do in an automated
way (github.com/wxphp/wxphp/blob/master/src/media.cpp#L170).
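
To make the "all overloads mixed into a single function" situation concrete, here is a minimal PHP sketch of what such a merged function tends to look like. The function name repeatText() and its three variants are purely hypothetical and are not taken from wxphp, PH7 or PHP itself.

```php
<?php
// Hypothetical sketch: repeatText() is not part of PHP, wxphp or PH7.
// One function body stands in for three "overloads" by inspecting its
// arguments at run time.
function repeatText(...$args): string
{
    $count = count($args);

    // Overload 1: repeatText(string $text) repeats the text twice.
    if ($count === 1 && is_string($args[0])) {
        return str_repeat($args[0], 2);
    }

    // Overload 2: repeatText(string $text, int $times).
    if ($count === 2 && is_string($args[0]) && is_int($args[1])) {
        return str_repeat($args[0], $args[1]);
    }

    // Overload 3: repeatText(array $parts, int $times) repeats each part.
    if ($count === 2 && is_array($args[0]) && is_int($args[1])) {
        $out = '';
        foreach ($args[0] as $part) {
            $out .= str_repeat((string) $part, $args[1]);
        }
        return $out;
    }

    throw new InvalidArgumentException('repeatText(): no overload matches');
}

echo repeatText('ab'), "\n";          // abab
echo repeatText('ab', 3), "\n";       // ababab
echo repeatText(['a', 'b'], 2), "\n"; // aabb
```

With real overloading each branch would be its own declaration with its own signature and documentation; here the dispatch logic has to be written and kept in sync by hand.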

Also, when working in an IDE like NetBeans it is a mess to supply an
interface file for auto-completion with methods/functions that have
different overloads. These kinds of things may at first look like they conflict
with the original PHP goals, but they are pretty useful once you understand
them.

10 years ago by Levi Morrison

Hi,

From: Arvids Godjuks [mailto:arvids.godjuks@gmail.com]

The 0.1 RFC version was mentioned a lot as a good compromise by many
people
and had major support.
Maybe someone competent could pick it up, make necessary adjustments
that
where required and let people vote on it? Start with small steps - get the
weak type hints into the language first, see how it gets used and then we
can always add strict type hints if there is a need/desire to do that.

That way we finally get type hints into the language, and those wanting the
strict variety have all the opportunities in the world to add them at a
later release with proper discussion and development time.

That's what I am planning. If I write an RFC, it will be based on Andrea's 0.1/0.2 version, and won't propose different modes.

The problem is that the previous controversial RFC focused people on weak vs strict typing, while we should have explored other technical concerns. Here are the main ones I see :

the fact that the RFC supports single types only, like the previous 'return type' RFC. While it is easier to implement, it opens several issues as multiply-typed arguments are an integral part of the PHP language (mostly completeness and compatibility with internal function hinting). If we want to support multiple types the same way for internal and userspace functions, we must extend the ZPP layer to support it.

the mechanism to check for type hints on internal functions, while easy to implement, is not sufficient, as a lot of internal functions get a bare zval from the parsing system and then convert it by themselves. With the proposed mechanism, there's no possible hinting on such argument, which will make the implementation different from the documentation. Even if the check is done by the function body, it won't be done in a consistent way with type hinting checks and won't raise a similar error. As most cases are related to multiply-typed args, the solution is in adding multiply-typed support to ZPP. Multiply-typed support needs to redefine scalar conversion rules, to take care of the target type being a combination of single types.

We need to define the appropriate extension to Reflection parameters/return type. That's not complex, but it takes time.

Other changes I'd like to propose are exposed in Bob Weinand's article, at https://gist.github.com/bwoebi/b4c5564388ecd004ba96. The article explains how restricting weak conversion possibilities would make strict typing almost useless. Changes include forbidding bool to int/float or '7years' to int. This cannot be left for future additions as BC breaks will make it impossible. To remain consistent between userspace/internal functions, this must also be done at the ZPP level.

Using bare class names as type hints is a potential issue too, as it makes reserved keywords and class names share the same naming space. I think we should deprecate the use of class names as type hints in favor of 'object(class-name)'. If we don't do that, every future addition of a type hint keyword will cause a BC break (and will be practically impossible).

Additional 'hybrid' types like 'numeric' and 'mixed' should be also provided.

So, most features I have in mind are really 'now or never'.

My main concern, anyway, is with March 15 announced feature freeze. If we need a vote by this date, it's impossible. And planning such BC for 7.1 is probably unrealistic because of the huge syntax additions and BC breaks it brings. So, if it's too late for an inclusion in 7.0, I think I'll give up.

So, could someone confirm what 'feature freeze' exactly means ?

Regards

François

The main issues are completeness (we can give hints for some cases, but not for others) and, more important, the compatibility with internal functions. As Andrea herself agreed, her mechanism for type hinting on internal functions is not sufficient. Just using the ZPP macros, as they exist today, won't work as a lot of internal functions get a bare zval and then convert it by themselves. So, in this case, we would check nothing. So, an argument described as 'string|array' in the documentation, wouldn't produce the same sort of error when sent an object, than its friend, described as 'string'. This is not consistent and will open a lot of side effects if it is left out of the type hint layer.

I have a sum types RFC that isn't ready for discussion. However, since
you have mentioned the functionality, here is the preliminary RFC:
https://wiki.php.net/rfc/sum_types.

Also, I know people REALLY want scalar types in PHP 7.0 but honestly
all we need to do is reserve the keywords so there is no BC impact and
then we can do it at any point during the PHP 7 lifecycle. This is my
preferred course of action, because right now this internals mailing
list is in HIGH STRESS MODE. I would rather just take the action of
reserving the types so it can be done in 7.1 (or 7.2 or even if it's
never, I would prefer to reserve these words).

10 years ago by francois@php.net

Hi,

From: morrison.levi@gmail.com [mailto:morrison.levi@gmail.com] On behalf of

Also, I know people REALLY want scalar types in PHP 7.0 but honestly
all we need to do is reserve the keywords so there is no BC impact and
then we can do it at any point during the PHP 7 lifecycle. This is my
preferred course of action, because right now this internals mailing
list is in HIGH STRESS MODE. I would rather just take the action of
reserving the types so it can be done in 7.1 (or 7.2 or even if it's
never, I would prefer to reserve these words).

If we deprecate using bare class names as type hints and replace it with the 'object(classname)' syntax, we can reserve keywords for 7.0 and, maybe 7.1 but potential name clashes should be away in 7.2.

François

10 years ago by Dennis Birkholz

On 16.02.2015 at 21:39, François Laupretre wrote:

If we deprecate using bare class names as type hints and replace it with the 'object(classname)' syntax, we can reserve keywords for 7.0 and, maybe 7.1 but potential name clashes should be away in 7.2.

This is a huge BC break and will always clash with existing type hints.
I also think it will reduce readability, and the gain is very
limited. We should introduce the type hints as case-sensitive (lower case),
so people can have their Integer and String classes without a clash.
Most people use upper-case class names already.
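
A minimal sketch of the coexistence described above, assuming lower-case int is the scalar hint and a hypothetical userland class named Integer. The class and function names are illustrative only. How an engine treats a class literally named String depends on how the keyword ends up being reserved, and that is not shown here.

```php
<?php
// Hypothetical userland wrapper class; its name differs from the scalar hint.
final class Integer
{
    private $value;

    public function __construct(int $value)
    {
        $this->value = $value;
    }

    public function toInt(): int
    {
        return $this->value;
    }
}

// Lower-case `int` names the scalar type.
function addOne(int $n): int
{
    return $n + 1;
}

// `Integer` names the class declared above, not the scalar type.
function unwrapAndAddOne(Integer $n): int
{
    return addOne($n->toInt());
}

echo unwrapAndAddOne(new Integer(41)), "\n"; // 42
```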

Greets
Dennis

10 years ago by Philip Sturgeon

I know it is very easy for people to say "Well, that v0.3 that I
didn't like has been withdrawn, so let's just crack on and do some
other new thing." but I would have to ask people to consider that v0.3
had two thirds majority, with a few people clearly admitting that
their No vote was down to declare.

So, if declare is the only thing blocking v0.3 from passing, despite
1/3rd of voters being sad about it, let's just go down that road
instead of throwing the baby out with the bath water and coming up
with some new approach that will be bike-shedded over until PHP 8 is
in feature freeze.

10 years ago by padraic.brady@gmail.com

I know it is very easy for people to say "Well, that v0.3 that I
didn't like has been withdrawn, so let's just crack on and do some
other new thing." but I would have to ask people to consider that v0.3
had two thirds majority, with a few people clearly admitting that
their No vote was down to declare.

So, if declare is the only thing blocking v0.3 from passing, despite
1/3rd of voters being sad about it, let's just go down that road
instead of throwing the baby out with the bath water and coming up
with some new approach that will be bike-shedded over until PHP 8 is
in feature freeze.

Hear, hear.

--
Pádraic Brady

http://blog.astrumfutura.com
http://www.survivethedeepend.com
Zend Framework Community Review Team
Zend Framework PHP-FIG Representative

10 years ago by francois@php.net

From: Philip Sturgeon [mailto:pjsturgeon@gmail.com]

I know it is very easy for people to say "Well, that v0.3 that I
didn't like has been withdrawn, so let's just crack on and do some
other new thing.

It's not so easy. It would be easier to do as you suggest. And you can still do it in your name.

Andrea had agreed we should write a follow-up RFC to solve the open issues we had identified. This is what I am proposing here, nothing more.

It is not some 'other new thing'. It's mostly a way to bring strict-typing fans back to consensus. Of course, you may find it useless...

" but I would have to ask people to consider that v0.3
had two thirds majority, with a few people clearly admitting that
their No vote was down to declare.

Even if we got 4 or 5 people to change their minds because of a new declare syntax, it is clear that we don't have consensus on this topic. So, IMO, the RFC is dead, whatever 2/3 or 3/4 we may have. Once it was clear that both camps would never agree, with every PHP founder against it, pushing it was useless. We're not electing a president, we're trying to ensure we make the right decision. The spirit is not the same.

So, if declare is the only thing blocking v0.3 from passing, despite
1/3rd of voters being sad about it, let's just go down that road
instead of throwing the baby out with the bath water and coming up
with some new approach that will be bike-shedded over until PHP 8 is
in feature freeze.

I am also writing this because I think the two-mode approach is far from the best one, as it would probably generate a lot of side effects we can't even imagine now. I voted 'yes' because I thought it was nice to have it in PHP 7 instead of nothing but, now, the RFC is dead and I am free to propose what I have in mind, especially in the hope that it can bring a consensus where it is lacking.

Once again, anyone can take over version 0.3, if it is so great. Why don't you do it ? I will play the game, stop working on my proposal, and vote 'yes' again. But don't ask me to do it in your place.

Regards

François

10 years ago by Dan Ackroyd

it is clear that we don't have consensus on this topic. So, IMO, the RFC is dead, whatever 2/3 or 3/4 we may have.

It's okay for people to disagree about things. And we have voting to
allow us to resolve those disagreements.

Claiming that only things that have near unanimous consensus can be
moved forward is bogus.

cheers
Dan

10 years ago by Sara Golemon

Once again, anyone can take over version 0.3, if it is so great. Why don't you do it ?
I will play the game, stop working on my proposal, and vote 'yes' again.
But don't ask me to do it in your place.

If nobody else does it, I will.

I think Andrea's 0.3 proposal was extremely well balanced, served
everyone's needs whether they would admit it or not, and whose only
failing (subjectively termed) was the use of declare(). I think
declare() is fine and not nearly as ugly as some have slandered it to
be, but I'm willing to read the winds and modify it for v0.4.

Straw poll:

<?php strict;
<?php-strict
use strict; (pseudo-namespace)
<?php // strict (I don't actually like HHVM's style, but if you do...)
declare(strict=true); (As a top-level declare only)
declare(strict=true); (exactly as in v0.3 -- maybe you liked it)
your write-in vote here

I'm not going to scope in union types, nullables, or falsables. We
can leave that for a followup RFC, this one is contentious enough as
it is.

-Sara

10 years ago by Arvids Godjuks

Might I remind everyone that time is not on our side here - feature freeze
is looming and actual work has to be done.
The part you must understand is: Strict type hints are possible if someone
cares to implement them with a next RFC. Be our guest. Right now we need to
sort out the basic stuff - the missing numeric/mixed/resource hints, the
ability to define mixed hints and make it all consistent. Maybe even
fix/change some of the conversion rules as a result (I'm just giving an example
here).

Give us some time to gather our thoughts, discuss stuff and do the update
to the RFC. Assuming stuff and pointing fingers before the new version is out is
just distracting.

10 years ago by francois@php.net

Arvids,

I’m afraid you’re still more naive than I am. Don’t you understand it’s dead ?

Even before Sara took over 0.3, they decided to revive Andrea’s v 0.1 with no change. The fight will take place between both. Our only right is to enlist in one camp and yell with the mass.

We intended to explore missing hints, union types, ZPP conversions. They just want to talk about the declare() syntax. Do you see the gap ? Oh yes, if you insist, you will be told all your concerns are premature. You will work on that later, when BC makes it impossible, or wait for 8.0.

I was surprised we could have a window to propose something creative, focusing on other concerns than this stupid declare() :). Imagine, we could have proposed something without a declare(). It would have been terrible. All this energy lost arguing about an unneeded directive ! Now, it’s clear. Declare() strikes back !

I know that’s frustrating but I also guess that our work would have been useless as everything would have been rejected. So, it may be better this way. Just enjoy the upcoming show. It has started already.

Regards

François

10 years ago by Nikita Popov

On Mon, Feb 16, 2015 at 2:50 PM, François Laupretre francois@php.net
wrote:

Once again, anyone can take over version 0.3, if it is so great. Why
don't you do it ?
I will play the game, stop working on my proposal, and vote 'yes' again.
But don't ask me to do it in your place.

If nobody else does it, I will.

I think Andrea's 0.3 proposal was extremely well balanced, served
everyone's needs whether they would admit it or not, and whose only
failing (subjectively termed) was the use of declare(). I think
declare() is fine and not nearly as ugly as some have slandered it to
be, but I'm willing to read the winds and modify it for v0.4.

Straw poll:

<?php strict;

<?php-strict

use strict; (pseudo-namespace)

<?php // strict (I don't actually like HHVM's style, but if you do...)

declare(strict=true); (As a top-level declare only)

declare(strict=true); (exactly as in v0.3 -- maybe you liked it)

your write-in vote here

I'm not going to scope in union types, nullables, or falsables. We
can leave that for a followup RFC, this one is contentious enough as
it is.

Thank you for taking over.

I like "use strict" and "declare as top-level only" most. The "<?php
strict" variants feel too ad-hoc.

Nikita

10 years ago by Dan Ackroyd

Thank you for taking over.

I like "use strict" and "declare as top-level only" most.

That would be this no vote changed to a yes.

And I'd also like to say thank you Sara for taking over.

cheers
Dan

10 years ago by Stelian Mocanita

Thanks Sara for taking over,

For me, <?php strict would seal the deal, but use strict; is also
an option I would endorse.

Despite the mention of not scoping in union types, I feel like a "numeric" type
would make a lot of sense and bring more consensus to the list, fixing
the well-known sin() and similar issues, where I also find it a slight drudgery
to keep casting.

Regards,
Stelian

10 years ago by Rasmus Lerdorf

Once again, anyone can take over version 0.3, if it is so great. Why don't you do it ?
I will play the game, stop working on my proposal, and vote 'yes' again.
But don't ask me to do it in your place.

If nobody else does it, I will.

I think Andrea's 0.3 proposal was extremely well balanced, served
everyone's needs whether they would admit it or not, and whose only
failing (subjectively termed) was the use of declare(). I think
declare() is fine and not nearly as ugly as some have slandered it to
be, but I'm willing to read the winds and modify it for v0.4.

I still disagree strongly that it serves everyone's needs. The internal
API and APIs provided by extensions are completely messed up by this
approach. Userspace authors get the choice when they write their code.
Even if the caller has turned on strict, if you haven't added strict
types to your functions/methods you are fine. Internal API and extension
authors don't get this luxury. If the caller turns on strict, then
suddenly an API that was written explicitly to make use of the coercive
characteristics PHP has had for 20+ years breaks and there is nothing I
can do about it as an extension author. This is beyond just int->float
coercion, which is, of course, also missing.

A simple example:

number_format($num);

So, for various inputs without strict enabled:

int(1000)
1,000

string(4) "1000"
1,000

float(1000)
1,000

string(5) "1000 "
Notice: A non well formed numeric value encountered in ...
1,000

string(5) " 1000"
1,000

string(9) "1000 dogs"
Notice: A non well formed numeric value encountered in ...
1,000

string(3) "dog"
Warning: number_format() expects parameter 1 to be float, string given
NULL

resource(5) of type (stream)
Warning: number_format() expects parameter 1 to be float, resource given
NULL

This is the intended API. The function does some sanity checking for
things that clearly don't make sense based on what it was written to do
and fails hard. It also lets the user know about questionable data and
relies on coercion for the rest. You could argue that the "1000 dogs"
case should be more severe than a notice, but that is pretty minor I
think. The extension author has the ability to make that more severe if
she likes.

Now turn on strict and we get:

int(1000)
number_format() expects parameter 1 to be float, integer given

string(4) "1000"
number_format() expects parameter 1 to be float, string given

float(1000)
1,000

string(5) "1000 "
number_format() expects parameter 1 to be float, string given

string(5) " 1000"
number_format() expects parameter 1 to be float, string given

string(9) "1000 dogs"
number_format() expects parameter 1 to be float, string given

string(3) "dog"
number_format() expects parameter 1 to be float, string given

resource(5) of type (stream)
number_format() expects parameter 1 to be float, resource given

That in itself might not be so terrible, but also not terribly useful
and it is nothing like what the extension author had intended. And how
do you think users will deal with internal functions that are now
suddenly strongly typed even though they were not designed to be? Do you
think they will go look at the source code for the function and mimic
the data sanity checks and put those in their userspace code? Doubtful,
and as people on irc and everywhere have told me, they will just cast.
No big deal. So, we run the same set of data through with strict enabled
and doing a number_format((float)$num):

int(1000)
1,000

string(4) "1000"
1,000

float(1000)
1,000

string(5) "1000 "
1,000

string(5) " 1000"
1,000

string(9) "1000 dogs"
1,000

string(3) "dog"
0

resource(5) of type (stream)
5

Out of the 3 scenarios, this is inarguably the worst outcome with the
last line there being the most blatant. Actually outputting an internal
resource id as if it was a valid number to be formatted without the
slightest notice or warning anywhere that something is wrong in the code.

This is my fear with this approach. People will start littering their
code with casts and it will hide real bugs which is the complete
opposite of what motivated this in the first place.
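
The cast scenario can be reproduced with a short script. This is only a sketch: it reuses the inputs from the listing above, leaves out the stream resource because its id differs between runs, and installs an error handler to show that nothing is reported along the way.

```php
<?php
// Inputs from the listing above. The stream resource is left out because
// its id differs between runs.
$inputs = [1000, "1000", 1000.0, " 1000", "1000 dogs", "dog"];

// Collect anything PHP reports, to show that the casts report nothing.
$diagnostics = [];
set_error_handler(function ($severity, $message) use (&$diagnostics) {
    $diagnostics[] = $message;
    return true;
});

$results = [];
foreach ($inputs as $num) {
    // The explicit cast always succeeds, whatever the input looks like.
    $results[] = number_format((float) $num);
}

restore_error_handler();

echo implode(' | ', $results), "\n"; // 1,000 | 1,000 | 1,000 | 1,000 | 1,000 | 0
echo count($diagnostics), " diagnostics reported\n"; // 0 diagnostics reported
```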

Without more thought into how we properly deal with internal/extension
code I don't really understand how so many people foresee this to work
perfectly in the real world.

-Rasmus

10 years ago by Drew Paroski

I think you bring up a good point, Rasmus. Extension implementations
have existed for many years in a world before user-land scalar
typehints. I can see how the author of an extension might have
written their code in a way that relies on assumptions about how
argument coercion behaves.

What would you say if zend_parse_parameters() API and friends were
left alone, and a new flag or new versions of these APIs were
introduced so that extension authors could gradually adopt the new
parameter checking/coercion scheme at their leisure?

When adopting the new parameter checking/coercion scheme, if an
author needs/wants to do something different for parameter checking
or coercion for a given extension, it occurred to me that they could
just omit the typehint (i.e. use "z" for zend_parse_parameters())
and then do manual checking inside the function's body. I've seen
this technique used in a number of extensions (and in pure PHP
functions) and it doesn't seem to cause significant problems AFAICT.
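
As a rough userland analogue of the "take the raw value and check it by hand" technique (the C-level "z" variant is not shown), a function can leave its parameter untyped and apply its own rules. formatAmount() and the rules it applies are hypothetical, chosen only for illustration.

```php
<?php
// Hypothetical userland analogue of receiving a raw value and validating it
// inside the function body. formatAmount() is not an existing API.
function formatAmount($num): string
{
    // Real numbers are accepted as they are.
    if (is_int($num) || is_float($num)) {
        return number_format((float) $num);
    }

    // Strings are accepted only if they are numeric once whitespace is trimmed.
    if (is_string($num) && is_numeric(trim($num))) {
        return number_format((float) trim($num));
    }

    // Everything else fails hard, with a message under the author's control.
    $type = is_object($num) ? get_class($num) : gettype($num);
    throw new InvalidArgumentException(
        "formatAmount() expects a number or numeric string, $type given"
    );
}

echo formatAmount(1000), "\n";    // 1,000
echo formatAmount("1000 "), "\n"; // 1,000
```

Because the parameter carries no type hint, the outcome is decided by the function's own checks rather than by the engine's parameter coercion.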

I know this isn't a fully baked solution to the issue you raise, but
I'm curious to hear your thoughts about this approach.

Regards,
Drew Paroski

10 years ago by Sara Golemon

I still disagree strongly that it serves everyone's needs. The internal
API and APIs provided by extensions are completely messed up by this
approach. Userspace authors get the choice when they write their code.
Even if the caller has turned on strict, if you haven't added strict
types to your functions/methods you are fine. Internal API and extension
authors don't get this luxury. If the caller turns on strict, then
suddenly an API that was written explicitly to make use of the coercive
characteristics PHP has had for 20+ years breaks and there is nothing I
can do about it as an extension author. This is beyond just int->float
coercion, which is, of course, also missing.

It's not your problem how a script author chooses to use your function.
Is it a bug in PHP that currently I can call number_format() with an
object? No. number_format() was written to take a float, and accepts
ints and numeric strings as a byproduct.

Adding the option on a per-file basis to make that call strict doesn't
change the intent or the usage of number_format(). number_format()
still gets a float and still acts on that float in the same way. All
that changes is whether passing an int or a numeric string is
considered "wrong" or not.
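
For illustration, here is roughly what that looks like with the per-file switch as PHP 7.0 eventually shipped it, declare(strict_types=1). Note that the released version widens int to float even in strict mode, which the RFC text under discussion here did not yet do. try_format() is a hypothetical helper that exists only to catch the error.

```php
<?php
declare(strict_types=1);

// Hypothetical helper: returns the formatted number, or the class name
// of the error thrown by the strict call to number_format().
function try_format($value): string
{
    try {
        return number_format($value);
    } catch (TypeError $e) {
        return get_class($e);
    }
}

echo try_format(1000.0), "\n"; // a float is always accepted
echo try_format(1000), "\n";   // an int is widened to float
echo try_format("1000"), "\n"; // a numeric string is rejected in this file
```

Remove the declare() line and all three calls format the number, because number_format() itself is unchanged.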

We can sigh and tut about this not being "the PHP way", but the script
author was the one who chose to enter into a tight contract, and the
script author, not you, is the one who should have that authority over
their own application.

And how do you think users will deal with internal functions that are now
suddenly strongly typed even though they were not designed to be?

I think they'll deal with it by dint of having /chosen/ to turn on
strict typing. It's not as though an upgrade to PHP 7.12 suddenly
made all their code strict without warning. In fact, nothing will
have changed for them at all UNLESS THEY EXPLICITLY ASK FOR IT.

Do you think they will go look at the source code for the function and mimic
the data sanity checks and put those in their userspace code?

No, they'll look at their own code and make a more informed decision
than any extension author can make for them.

Out of the 3 scenarios, this is inarguably the worst outcome with the
last line there being the most blatant. Actually outputting an internal
resource id as if it was a valid number to be formatted without the
slightest notice or warning anywhere that something is wrong in the code.

Again, you're assuming a single method call in a vacuum. Is it
possible that the kind of script author who is turning on strict
typing might, in fact, be strict in the rest of their application?
Might, in fact, not try to pass a resource into a function expecting
float because they have the tools of strict typing available to them?

This is my fear with this approach. People will start littering their
code with casts and it will hide real bugs which is the complete
opposite of what motivated this in the first place.

So let's talk compromise.
Would leaving internal functions out of the picture at this stage
change your mind? This is effectively what Hack does, internal
functions are explicitly marked as "coercible".
Would a tri-state option make sense? ('weak-all',
'strict-user/weak-internal', 'strict-all')
How do we get from here to something you would like?

Without more thought into how we properly deal with internal/extension
code I don't really understand how so many people foresee this to work
perfectly in the real world.

<kidding>This would be an opportune time for me to quote you as saying
you're really not that into languages. :)</kidding>

As I said above, I foresee this working well in the real world because
nobody is acting in a vacuum, there's an entire application's worth of
type-checked code going into it.

-Sara

10 years ago by Rasmus Lerdorf

I still disagree strongly that it serves everyone's needs. The internal
API and APIs provided by extensions are completely messed up by this
approach. Userspace authors get the choice when they write their code.
Even if the caller has turned on strict, if you haven't added strict
types to your functions/methods you are fine. Internal API and extension
authors don't get this luxury. If the caller turns on strict, then
suddenly an API that was written explicitly to make use of the coercive
characteristics PHP has had for 20+ years breaks and there is nothing I
can do about it as an extension author. This is beyond just int->float
coercion, which is, of course, also missing.

It's not your problem how a script author chooses to use your function.

I would very much like it to be my problem how my API is exposed to a
user. At the very least I should have as much control over an API
written in C as one written in PHP. Currently with this RFC it is not
the case and worse, it is retroactively changing an existing set of
functions to be all-strict when this mode is turned on. Even with strict
mode enabled, userspace is not retroactively turned strict and you can
slowly transition your API to be strict by adding appropriate strict
types in a controlled fashion.

And this fact is quickly glossed over to the point where I believe a lot
of people didn't even realize that they were voting for a retroactive
all-strict internal API. But hopefully I am wrong on that point.

We can sigh and tut about this not being "the PHP way", but the script
author was the one who chose to enter into a tight contract, and the
script author, not you, is the one who should have that authority over
their own application.

I find this view way too extreme. Not everything is this black and white
in the real world. We have to be really careful that we don't give the
illusion of better while we make things worse for people. In the real
world, you know for a fact that people are going to force-cast stuff and
they aren't necessarily going to be very careful about it.

The illusion being strict is better than weak-coercive, so let's turn on
strict everywhere. Oh, damn, some stuff broke. Throw in some casts,
things work and hey we are now safer because we are strict. Where in
reality "strict+force cast" tends to be worse than "weak coercive" as
per the number_format((float)"dog") example.
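
A small sketch of why the forced cast is the worse outcome, compared with validating first. to_float_or_fail() is a hypothetical helper, and filter_var() is just one way to do the validation.

```php
<?php
// A forced cast never fails: non-numeric input silently becomes 0.0.
$fromCast = number_format((float) "dog");
echo $fromCast, "\n"; // prints "0", with no notice or warning

// Hypothetical helper: validate first, and fail loudly on bad input.
function to_float_or_fail($value): float
{
    $result = filter_var($value, FILTER_VALIDATE_FLOAT);
    if ($result === false) {
        throw new InvalidArgumentException(
            'expected a number, got ' . gettype($value)
        );
    }
    return $result;
}

echo number_format(to_float_or_fail("1000")), "\n";
```

The cast turns a type problem into a wrong value, while the helper keeps the error visible at the point where the bad data arrives.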

And how do you think users will deal with internal functions that are now
suddenly strongly typed even though they were not designed to be?

I think they'll deal with it by dint of having /chosen/ to turn on
strict typing. It's not as though an upgrade to PHP 7.12 suddenly
made all their code strict without warning. In fact, nothing will
have changed for them at all UNLESS THEY EXPLICITLY ASK FOR IT.

Right, but they will want to start using strict, or else what are we
even talking about? And as soon as they do that string that comes back
from their ORM and passed to number_format() now needs to be dealt with
somehow. I'll bet you a Tim Horton's doughnut that a depressingly high
percentage of people will simply toss a (float) cast on it and move on
and by doing so they have made their application worse.

Again, you're assuming a single method call in a vacuum. Is it
possible that the kind of script author who is turning on strict
typing might, in fact, be strict in the rest of their application?
Might, in fact, not try to pass a resource into a function expecting
float because they have the tools of strict typing available to them?

That's a lot of assumptions. Library functions get data from all sorts
of edges that are outside of the scope of simple type checking on
function arguments. I'd rather not encourage incorrect behaviour here by
retroactively changing the API like that.

So let's talk compromise.
Would leaving internal functions out of the picture at this stage
change your mind? This is effectively what Hack does, internal
functions are explicitly marked as "coercible".
Would a tri-state option make sense? ('weak-all',
'strict-user/weak-internal', 'strict-all')
How do we get from here to something you would like?

So in Hack you didn't think it was a good idea to change the internal
and extension api either? Was the reasoning similar to mine? Did you
agree with the reason then but not now?

-Rasmus

10 years ago by Sara Golemon

I would very much like it to be my problem how my API is exposed to a
user. At the very least I should have as much control over an API
written in C as one written in PHP.

And you have that control. You expose number_format() as taking a
float, and it does, in weak and strict modes alike.

Currently with this RFC it is not
the case and worse, it is retroactively changing an existing set of
functions to be all-strict when this mode is turned on. Even with strict
mode enabled, userspace is not retroactively turned strict and you can
slowly transition your API to be strict by adding appropriate strict
types in a controlled fashion.

That's a dubious distinction. The "special" treatment of internal
functions is due to them having types defined. Of course userspace
functions don't become immediately strict since they don't
automatically have types.

And this fact is quickly glossed over to the point where I believe a lot
of people didn't even realize that they were voting for a retroactive
all-strict internal API. But hopefully I am wrong on that point.

It's presumptive to assume you know what people are basing their votes
on. You're better than that.

We can sigh and tut about this not being "the PHP way", but the script
author was the one who chose to enter into a tight contract, and the
script author, not you, is the one who should have that authority over
their own application.

I find this view way too extreme.

You find giving authority over an application to the application
author too extreme?

Not everything is this black and white in the real world.
We have to be really careful that we don't give the
illusion of better while we make things worse for people. In the real
world, you know for a fact that people are going to force-cast stuff and
they aren't necessarily going to be very careful about it.

THIS point I agree with in principle. Some developers are going to
misuse this feature. That's true of every feature. It was true of
goto, it's true of the input filter extension, it's true of PDO
prepared statements. People use stuff the wrong way. That doesn't
mean we should sacrifice functionality for everyone.

Right, but they will want to start using strict, or else what are we
even talking about? And as soon as they do that string that comes back
from their ORM and passed to number_format() now needs to be dealt with
somehow. I'll bet you a Tim Horton's doughnut that a depressingly high
percentage of people will simply toss a (float) cast on it and move on
and by doing so they have made their application worse.

And a hearteningly high percentage will use it the right way. I'll
buy you that doughnut and have one myself.

Again, you're assuming a single method call in a vacuum. Is it
possible that the kind of script author who is turning on strict
typing might, in fact, be strict in the rest of their application?
Might, in fact, not try to pass a resource into a function expecting
float because they have the tools of strict typing available to them?

That's a lot of assumptions.

It's one assumption. That there exists a population of PHP developers
who actually know how to program. I'm sorry you don't think that's
the case.

So let's talk compromise.
Would leaving internal functions out of the picture at this stage
change your mind? This is effectively what Hack does, internal
functions are explicitly marked as "coercible".
Would a tri-state option make sense? ('weak-all',
'strict-user/weak-internal', 'strict-all')
How do we get from here to something you would like?

So in Hack you didn't think it was a good idea to change the internal
and extension api either? Was the reasoning similar to mine? Did you
agree with the reason then but not now?

First, I'd like to note that you seem to be ignoring my question.
** How do we get from here to something you would like?

Second, I should clarify that while the HHVM runtime performs
coercion, the hack type checker is strict. So my original statement
was inaccurate. As far as hack is concerned, it's simply strict.
Period.

Third, not everything done in hack is in line with what I would
like. Don't mistake me for hack.

-Sara

10 years ago by Rasmus Lerdorf

Second, I should clarify that while the HHVM runtime performs
coercion, the hack type checker is strict. So my original statement
was inaccurate. As far as hack is concerned, it's simply strict.
Period.

With both the default (partial) type checking and strict enabled, my
number_format() example in Hack produces:

int(1000)
1,000

string(4) "1000"
1,000

float(1000)
1,000

string(5) "1000 "
Warning: number_format() expects parameter 1 to be double, string given

string(5) " 1000"
1,000

string(9) "1000 dogs"
Warning: number_format() expects parameter 1 to be double, string given

string(3) "dog"
Warning: number_format() expects parameter 1 to be double, string given

resource(4) of type (stream)
Warning: number_format() expects parameter 1 to be double, resource given

Basically it accepts ints, floats and well-formed numeric strings and
the hh_client type checker is telling me I have "No errors". So the only
difference between Hack's strict mode and the current coercive behaviour
in PHP is strings with trailing chars. The "1000 dogs" case. "1000 " as
well in my example, but that is the same case. Where in PHP you get a
notice but it still does the conversion and in Hack you get a warning
and the conversion isn't done. So even though Hack has both a "partial"
type checking mode and a "strict" mode, the decision was to still do
type coercion for the others. I kind of expected it to only accept a
float in full-on strict mode to mimic the no-compromise strictness
proposed in the RFC.

Also, looking through the code, I really don't see this "simply strict"
anywhere when it comes to calling internal functions. For example:

$a = [1,2,3,4,5];
print_r(array_reverse($a,"0"));

It doesn't complain that "0" is a string and not a boolean. It doesn't
even complain about "dog" there.

And the one everyone gets uppity about. bool->int conversion in
curl_setopt(). eg.

$ch = curl_init("https://74.125.28.104");
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
echo curl_exec($ch);
echo curl_error($ch);

PHP obviously converts true to 1 there which has been a problem because
what people really meant was to set it to 2. We spew a notice for this,
of course:

Notice: curl_setopt(): CURLOPT_SSL_VERIFYHOST with value 1 is deprecated
and will be removed as of libcurl 7.28.1. It is recommended to use value
2 instead in ...

In Hack it appears that true is also converted to 1 in <?hh // strict
mode and no notice appears and the hh_client type checker doesn't
complain. If instead of true I pass in an array of strings, it still
converts it to 1 even though the type is blatantly wrong. It looks like
it was kept quite loose to match PHP and not cause too much legacy code
to break. In this particular case it is pretty dangerous to be
completely silent about it though since it actually means no host
verification is getting done. The output when properly set to 2 from
both PHP and Hack is:

SSL: certificate subject name 'www.google.com' does not match target
host name '74.125.28.104'

Please correct me here if I somehow ran these incorrectly. I did put
some deliberate type errors into my userspace code and hh_client caught
those nicely, so it seems like it was working, but it didn't catch
anything when it came to calling the internal API functions.

eg.

<?hh // strict
function test() : int {
$ch = curl_init("https://74.125.28.104");
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, ["I have no idea what I am
doing"]);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
echo curl_exec($ch);
echo curl_error($ch);
return "beer";
}

hh_client reports:
/home/rasmus/test/a.php:8:12,17: Invalid return type (Typing[4110])
/home/rasmus/test/a.php:2:19,21: This is an int
/home/rasmus/test/a.php:8:12,17: It is incompatible with a string

When I change return "beer" to return 1 hh_client is happy.

So, you keep asking what I would support. I would like to see an RFC
along the following lines:

1. Tighten up the type coercion for the "1000 dogs" case although we
have to look at whether there is a problem with some database APIs
returning space-padded fields so "1000 " would now break.
Hopefully that is fringe enough to not break the world.

2a. In strict mode, tone down the strictness and allow non-lossy
coercion including int->float. And yes, I know in really edge cases
that isn't technically non-lossy, but for all practical purposes it
is.

2b. A much more flexible system for specifying multiple types. I should
be able to say that my function takes something that looks like a
number if I choose and still take advantage of stricter typing for
other parameters.

3. Don't turn on crazy-strict mode for internal functions that weren't
designed for that. Instead provide the same ability as userspace gets
for developers to gradually design their APIs to be stricter if they
so desire allowing both Hack and PHP to implement a stricter
curl_setopt(), for example.

-Rasmus

10 years ago by Benjamin Eberlei

Second, I should clarify that while the HHVM runtime performs
coercion, the hack type checker is strict. So my original statement
was inaccurate. As far as hack is concerned, it's simply strict.
Period.

With both the default (partial) type checking and strict enabled, my
number_format() example in Hack produces:

int(1000)
1,000

string(4) "1000"
1,000

float(1000)
1,000

string(5) "1000 "
Warning: number_format() expects parameter 1 to be double, string given

string(5) " 1000"
1,000

string(9) "1000 dogs"
Warning: number_format() expects parameter 1 to be double, string given

string(3) "dog"
Warning: number_format() expects parameter 1 to be double, string given

resource(4) of type (stream)
Warning: number_format() expects parameter 1 to be double, resource given

Basically it accepts ints, floats and well-formed numeric strings and
the hh_client type checker is telling me I have "No errors". So the only
difference between Hack's strict mode and the current coercive behaviour
in PHP is strings with trailing chars. The "1000 dogs" case. "1000 " as
well in my example, but that is the same case. Where in PHP you get a
notice but it still does the conversion and in Hack you get a warning
and the conversion isn't done. So even though Hack has both a "partial"
type checking mode and a "strict" mode, the decision was to still do
type coercion for the others. I kind of expected it to only accept a
float in full-on strict mode to mimic the no-compromise strictness
proposed in the RFC.

Also, looking through the code, I really don't see this "simply strict"
anywhere when it comes to calling internal functions. For example:
$a = [1,2,3,4,5];
print_r(array_reverse($a,"0"));
It doesn't complain that "0" is a string and not a boolean. It doesn't
even complain about "dog" there.

And the one everyone gets uppity about. bool->int conversion in
curl_setopt(). eg.
$ch = curl_init("https://74.125.28.104");
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
echo curl_exec($ch);
echo curl_error($ch);
PHP obviously converts true to 1 there which has been a problem because
what people really meant was to set it to 2. We spew a notice for this,
of course:

Notice: curl_setopt(): CURLOPT_SSL_VERIFYHOST with value 1 is deprecated
and will be removed as of libcurl 7.28.1. It is recommended to use value
2 instead in ...

I think curl_setopt is a misleading example in the typehinting discussion,
because this kind of API does not benefit from it. The third argument
depends on the second argument and requires a "generic" type in code:

curl_setopt(resource $ch, int $option, mixed $data);

It won't be possible to change this (or any similar API) with strict type
hints.

The code to convert a boolean $data to integer in the VERIFYPEER case is
manually implemented and therefore subject to the implementor's design
decisions.
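
To make the dependency concrete, here is a rough sketch of an option setter whose value type can only be checked by hand, per option. The option names, the OPTION_CHECKS table and check_option() are all hypothetical. This is not how ext/curl is implemented.

```php
<?php
// Hypothetical option table: option name => predicate the value must satisfy.
// A single type hint on the value parameter cannot express this mapping.
const OPTION_CHECKS = [
    'verify_host'     => 'is_int',
    'return_transfer' => 'is_bool',
    'url'             => 'is_string',
];

// Returns true when $value has the type this particular option expects.
function check_option(string $option, $value): bool
{
    if (!isset(OPTION_CHECKS[$option])) {
        throw new InvalidArgumentException("unknown option: $option");
    }
    $check = OPTION_CHECKS[$option];
    return $check($value);
}

var_dump(check_option('verify_host', 2));    // bool(true)
var_dump(check_option('verify_host', true)); // bool(false)
```

The value parameter has to stay untyped ("mixed"), so any strictness for such an API comes from code like this in the function body rather than from the signature.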

In Hack it appears that true is also converted to 1 in <?hh // strict
mode and no notice appears and the hh_client type checker doesn't
complain. If instead of true I pass in an array of strings, it still
converts it to 1 even though the type is blatantly wrong. It looks like
it was kept quite loose to match PHP and not cause too much legacy code
to break. In this particular case it is pretty dangerous to be
completely silent about it though since it actually means no host
verification is getting done. The output when properly set to 2 from
both PHP and Hack is:

SSL: certificate subject name 'www.google.com' does not match target
host name '74.125.28.104'

Please correct me here if I somehow ran these incorrectly. I did put
some deliberate type errors into my userspace code and hh_client caught
those nicely, so it seems like it was working, but it didn't catch
anything when it came to calling the internal API functions.

eg.

<?hh // strict
function test() : int {
$ch = curl_init("https://74.125.28.104");
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, ["I have no idea what I am
doing"]);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
echo curl_exec($ch);
echo curl_error($ch);
return "beer";
}

hh_client reports:
/home/rasmus/test/a.php:8:12,17: Invalid return type (Typing[4110])
/home/rasmus/test/a.php:2:19,21: This is an int
/home/rasmus/test/a.php:8:12,17: It is incompatible with a string

When I change return "beer" to return 1 hh_client is happy.

So, you keep asking what I would support. I would like to see an RFC
along the following lines:

1. Tighten up the type coercion for the "1000 dogs" case although we
have to look at whether there is a problem with some database APIs
returning space-padded fields so "1000 " would now break.
Hopefully that is fringe enough to not break the world.

2a. In strict mode, tone down the strictness and allow non-lossy
coercion including int->float. And yes, I know in really edge cases
that isn't technically non-lossy, but for all practical purposes it
is.

or

2b. A much more flexible system for specifying multiple types. I should
be able to say that my function takes something that looks like a
number if I choose and still take advantage of stricter typing for
other parameters.

3. Don't turn on crazy-strict mode for internal functions that weren't
designed for that. Instead provide the same ability as userspace gets
for developers to gradually design their APIs to be stricter if they
so desire allowing both Hack and PHP to implement a stricter
curl_setopt(), for example.

-Rasmus

10 years ago by Rasmus Lerdorf

I think curl_setopt is a misleading example in the typehinting discussion,
because this kind of API does not benefit from it. The third argument
depends on the second argument and requires a "generic" type in code:

curl_setopt(resource $ch, int $option, mixed $data);

It won't be possible to change this (or any similar API) with strict type
hints.

The code to convert a boolean $data to integer in the VERIFYPEER case is
manually implemented and therefore subject to the implementor's design
decisions.

Sure, I realize this, but it is the bool->int coercion example that is
always brought up. A static analysis type checker would have trouble
catching this, but both PHP and Hack apply that coercion at runtime:

   case CURLOPT_SSL_VERIFYHOST:
       if(Z_BVAL_PP(zvalue) == 1) {

The runtime could say, hey, I am in strict mode here and you are passing
me an array of strings which I am coercing to a boolean. Not cool.
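
The silent coercion chain is easy to reproduce in plain PHP. The guard at the end is only a sketch of the kind of check a stricter implementation could apply. is_valid_verify_host() is a hypothetical name and its rule is an assumption.

```php
<?php
$value = ["I have no idea what I am doing"];

// Any non-empty array is truthy, so a boolean test on it succeeds...
$asBool = (bool) $value;
// ...and true becomes 1 when an integer is needed.
$asInt = (int) $asBool;

var_dump($asBool); // bool(true)
var_dump($asInt);  // int(1)

// Hypothetical guard: only a real integer is accepted for this option.
function is_valid_verify_host($value): bool
{
    return is_int($value);
}

var_dump(is_valid_verify_host($value)); // bool(false)
```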

-Rasmus

10 years ago by Sara Golemon

Second, I should clarify that while the HHVM runtime performs
coercion, the hack type checker is strict. So my original statement
was inaccurate. As far as hack is concerned, it's simply strict.
Period.

With both the default (partial) type checking and strict enabled, my
number_format() example in Hack produces:

...

Please correct me here if I somehow ran these incorrectly. I did put
some deliberate type errors into my userspace code and hh_client caught
those nicely, so it seems like it was working, but it didn't catch
anything when it came to calling the internal API functions.

The mechanisms are strict, but the definitions, in hack, are untyped,
so there's nothing to validate:

hphp/hack/hhi/stdlib/builtins_string.hhi:
function number_format($number, $decimals = 0, $dec_point = ".",
$thousands_sep = ",");

We left a lot of stuff untyped from hack's point of view precisely
because so many of PHP's APIs are nonsensical. Have you looked at
what chr() does with bad types lately? Yikes.

So, you keep asking what I would support. I would like to see an RFC
along the following lines:

1. Tighten up the type coercion for the "1000 dogs" case although we
have to look at whether there is a problem with some database APIs
returning space-padded fields so "1000 " would now break.
Hopefully that is fringe enough to not break the world.

Hopefully, though I think that we could embrace the idea of trailing
space as insignificant.

2a. In strict mode, tone down the strictness and allow non-lossy
coercion including int->float. And yes, I know in really edge cases
that isn't technically non-lossy, but for all practical purposes it
is.

Nod. Ze'ev called for this too.

2b. A much more flexible system for specifying multiple types. I should
be able to say that my function takes something that looks like a
number if I choose and still take advantage of stricter typing for
other parameters.

Union types. I'm hearing a lot of support for this concept, and not
exclusively from one camp.
Perhaps with a pseudo-type defined somewhere to account for the
half-type "numeric string".

3. Don't turn on crazy-strict mode for internal functions that weren't
designed for that. Instead provide the same ability as userspace gets
for developers to gradually design their APIs to be stricter if they
so desire allowing both Hack and PHP to implement a stricter
curl_setopt(), for example.

Perhaps a ZEND_ACC_STRICT flag which lets an API opt-in to strict mode?
Or something passed to the arg_info struct? The details are secondary,
but you get my meaning...

-Sara

10 years ago by Benjamin Eberlei

On Tue, Feb 17, 2015 at 12:22 AM, Rasmus Lerdorf rasmus@lerdorf.com
wrote:

Second, I should clarify that while the HHVM runtime performs
coercion, the hack type checker is strict. So my original statement
was inaccurate. As far as hack is concerned, it's simply strict.
Period.

With both the default (partial) type checking and strict enabled, my
number_format() example in Hack produces:

...

Please correct me here if I somehow ran these incorrectly. I did put
some deliberate type errors into my userspace code and hh_client caught
those nicely, so it seems like it was working, but it didn't catch
anything when it came to calling the internal API functions.

The mechanisms are strict, but the definitions, in hack, are untyped,
so there's nothing to validate:

hphp/hack/hhi/stdlib/builtins_string.hhi:
function number_format($number, $decimals = 0, $dec_point = ".",
$thousands_sep = ",");

We left a lot of stuff untyped from hack's point of view precisely
because so many of PHP's APIs are nonsensical. Have you looked at
what chr() does with bad types lately? Yikes.

Wait, so Hack is actually only treating userland functions as strict
(and maybe the occasional internal function).

This approach would immediately fix all the number_format, sin, tan
problems, but would again be rejected by static proponents for not
being complete for analysis. We can go around this circle one more
time :-)

So, you keep asking what I would support. I would like to see an RFC
along the following lines:

1. Tighten up the type coercion for the "1000 dogs" case although we
have to look at whether there is a problem with some database APIs
returning space-padded fields so "1000 " would now break.
Hopefully that is fringe enough to not break the world.

Hopefully, though I think that we could embrace the idea of trailing
space as insignificant.

2a. In strict mode, tone down the strictness and allow non-lossy
coercion including int->float. And yes, I know in really edge cases
that isn't technically non-lossy, but for all practical purposes it
is.

Nod. Ze'ev called for this too.

2b. A much more flexible system for specifying multiple types. I should
be able to say that my function takes something that looks like a
number if I choose and still take advantage of stricter typing for
other parameters.

Union types. I'm hearing a lot of support for this concept, and not
exclusively from one camp.
Perhaps with a pseudo-type defined somewhere to account for the
half-type "numeric string".

3. Don't turn on crazy-strict mode for internal functions that weren't
designed for that. Instead provide the same ability as userspace gets
for developers to gradually design their APIs to be stricter if they
so desire allowing both Hack and PHP to implement a stricter
curl_setopt(), for example.

Perhaps a ZEND_ACC_STRICT flag which lets an API opt-in to strict mode?
Or something passed to the arg_info struct? The details are secondary,
but you get my meaning...

-Sara

10 years ago by Rasmus Lerdorf

Please correct me here if I somehow ran these incorrectly. I did put
some deliberate type errors into my userspace code and hh_client caught
those nicely, so it seems like it was working, but it didn't catch
anything when it came to calling the internal API functions.

The mechanisms are strict, but the definitions, in hack, are untyped,
so there's nothing to validate:

hphp/hack/hhi/stdlib/builtins_string.hhi:
function number_format($number, $decimals = 0, $dec_point = ".",
$thousands_sep = ",");

Right, so most of the internal API functions were omitted from strict
typing in Hack it looks like except for some places where it made sense
to selectively apply stricter checks. The RFC as it stands doesn't give
us this option which is my major problem with it.

Perhaps a ZEND_ACC_STRICT flag which lets an API opt-in to strict mode?
Or something passed to the arg_info struct? The details are secondary,
but you get my meaning...

Yes, something along those lines to allow gradual and selective strictness.

The internal/extension api is just another library and the authors of
these library functions should have the same allowance as userspace
library authors. Like you said in one reply, "What's yours is yours,
what's theirs is theirs."

-Rasmus

10 years ago by francois@php.net

From: php@golemon.com [mailto:php@golemon.com] On behalf of Sara Golemon

1. Tighten up the type coercion for the "1000 dogs" case although we
have to look at whether there is a problem with some database APIs
returning space-padded fields so "1000 " would now break.
Hopefully that is fringe enough to not break the world.

Hopefully, though I think that we could embrace the idea of trailing
space as insignificant.

I'll propose allowing both leading and trailing whitespace :) This doesn't cost much and can prove useful.
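
As a rough userland sketch of that rule (is_numeric_padded() is a hypothetical name, and trim() is an assumption about which characters count as whitespace):

```php
<?php
// Hypothetical predicate: a value is numeric if it is an int, a float, or a
// string that is numeric once surrounding whitespace is ignored.
function is_numeric_padded($value): bool
{
    if (is_int($value) || is_float($value)) {
        return true;
    }
    return is_string($value) && is_numeric(trim($value));
}

var_dump(is_numeric_padded("1000 "));     // bool(true)
var_dump(is_numeric_padded(" 1000"));     // bool(true)
var_dump(is_numeric_padded("1000 dogs")); // bool(false)
```

Space-padded database fields would pass, while the "1000 dogs" case would still be rejected.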

2b. A much more flexible system for specifying multiple types. I should
be able to say that my function takes something that looks like a
number if I choose and still take advantage of stricter typing for
other parameters.

Union types. I'm hearing a lot of support for this concept, and not
exclusively from one camp.
Perhaps with a pseudo-type defined somewhere to account for the
half-type "numeric string".

'numeric' can be implemented in two ways: as a union type, or as a new zpp type. I think I prefer the flexibility of union types. Maybe we'll have to include it in the first release, finally :)
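
For reference, the union-type route could look like the sketch below, written in the int|float syntax that PHP 8.0 later added. half() and try_half() are hypothetical names, and the example does not cover the "numeric string" half-type mentioned above.

```php
<?php
declare(strict_types=1);

// Hypothetical function accepting either numeric type, but not strings.
function half(int|float $n): float
{
    return $n / 2.0;
}

// Hypothetical helper: returns the result, or the class name of the error.
function try_half($value)
{
    try {
        return half($value);
    } catch (TypeError $e) {
        return get_class($e);
    }
}

var_dump(try_half(10));   // float(5)
var_dump(try_half(2.5));  // float(1.25)
var_dump(try_half("10")); // string(9) "TypeError"
```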

3. Don't turn on crazy-strict mode for internal functions that weren't
designed for that. Instead provide the same ability as userspace gets
for developers to gradually design their APIs to be stricter if they
so desire allowing both Hack and PHP to implement a stricter
curl_setopt(), for example.

Perhaps a ZEND_ACC_STRICT flag which lets an API opt-in to strict mode?
Or something passed to the arg_info struct? The details are secondary,
but you get my meaning...

I prefer defining four new 'strict' ZPP types for int, float, bool, and string (others are already strict). This way, the function decides what it accepts, not the user. Anyway, if we implement union types as strict-only, we will need these types.

Regards

François

10 years ago by Leigh

We can sigh and tut about this not being "the PHP way", but the script
author was the one who chose to enter into a tight contract, and the
script author, not you, is the one who should have that authority over
their own application.

I find this view way too extreme.

You find giving authority over an application to the application
author too extreme?

And you find taking authority over a library away from the library
author completely acceptable?

If I write an API that works perfectly well in strict mode, why
shouldn't I be able to turn strict on for my whole library? Do I just
tell users that non-strict mode constitutes undefined behavior for
this library, and refuse to fix any bugs that come up because of it?

I'm sure I could find a way of detecting non-strict mode and throw a
fatal, or force access through a facade/wrapper of some sort where
I've turned on strict and made myself the caller. Isn't this equally
unhelpful? The point is some people will want strict turned on, and
they will find ways to force it on people. You're going to have to
live with it, so just make it a possibility from the outset.

10 years ago by Andrey Andreev

We can sigh and tut about this not being "the PHP way", but the script
author was the one who chose to enter into a tight contract, and the
script author, not you, is the one who should have that authority over
their own application.

I find this view way too extreme.

You find giving authority over an application to the application
author too extreme?

And you find taking authority over a library away from the library
author completely acceptable?

If I write an API that works perfectly well in strict mode, why
shouldn't I be able to turn strict on for my whole library? Do I just
tell users that non-strict mode constitutes undefined behavior for
this library, and refuse to fix any bugs that come up because of it?

I'm sure I could find a way of detecting non-strict mode and throw a
fatal, or force access through a facade/wrapper of some sort where
I've turned on strict and made myself the caller. Isn't this equally
unhelpful? The point is some people will want strict turned on, and
they will find ways to force it on people. You're going to have to
live with it, so just make it a possibility from the outset.

^ That. I've said the same thing multiple times already.

Cheers,
Andrey.

10 years ago by Dennis Birkholz

On 17.02.2015 at 12:30, Leigh wrote:

And you find taking authority over a library away from the library
author completely acceptable?

If I write an API that works perfectly well in strict mode, why
shouldn't I be able to turn strict on for my whole library? Do I just
tell users that non-strict mode constitutes undefined behavior for
this library, and refuse to fix any bugs that come up because of it?

As the library author you will never ever notice if your library was
called in strict mode or not! And that is the point: you will not get
any gain from making all your parameters strict, you will just force the
user to cast (as Rasmus said already).

Repeating that strict mode is required from a library author's point of
view does not make it right. You always get the types you want, you just
limit the library consumer.

But you may want your code to call other functions in strict mode to catch
some type errors; that is perfectly valid, I don't deny that.

Thanks
Dennis

10 years ago by Andrey Andreev

Hi,

On 17.02.2015 at 12:30, Leigh wrote:

And you find taking authority over a library away from the library
author completely acceptable?

If I write an API that works perfectly well in strict mode, why
shouldn't I be able to turn strict on for my whole library? Do I just
tell users that non-strict mode constitutes undefined behavior for
this library, and refuse to fix any bugs that come up because of it?

As the library author you will never ever notice if your library was
called in strict mode or not! And that is the point: you will not get
any gain from making all your parameters strict, you will just force the
user to cast (as Rasmus said already).

Repeating that strict mode is required from a library author's point of
view does not make it right. You always get the types you want, you just
limit the library consumer.

But you may want your code to call other functions in strict mode to catch
some type errors; that is perfectly valid, I don't deny that.

What you've said has been repeated tens of times already. Many of us
just disagree with that rationale, because it's missing the point.

Nobody is stupid enough not to know that they always receive the
specified type. There's just a big difference between knowing that you
will receive, e.g., a boolean, and knowing that the user passed a
boolean.
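One way to read that distinction, sketched with the scalar type declarations that later shipped in PHP 7.0 running in their default (coercive) mode. The function name is made up for illustration.

```php
<?php
// Illustration only; set_verbose() is a hypothetical library function.
// Requires PHP 7.0+ and must NOT be run under declare(strict_types=1).
function set_verbose(bool $enabled): string
{
    // The body always sees a real bool, whatever the caller passed.
    return $enabled ? 'on' : 'off';
}

echo set_verbose(false), "\n";   // caller passed a bool
echo set_verbose("false"), "\n"; // non-empty string coerces to true
echo set_verbose("0"), "\n";     // "0" is one of the few falsy strings
```

The library author receives a boolean in all three calls, but only in the first did the caller actually pass one; in the second, the coerced value is the opposite of what the caller most likely meant.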

Cheers,
Andrey.

10 years ago by Pierre Joye

On 17.02.2015 at 12:30, Leigh wrote:

And you find taking authority over a library away from the library
author completely acceptable?

If I write an API that works perfectly well in strict mode, why
shouldn't I be able to turn strict on for my whole library? Do I just
tell users that non-strict mode constitutes undefined behavior for
this library, and refuse to fix any bugs that come up because of it?

As the library author you will never ever notice if your library was
called in strict mode or not! And that is the point: you will not get
any gain from making all your parameters strict, you will just force the
user to cast (as Rasmus said already).

No, and Rasmus's examples, while technically correct for some of
them, just add confusion to the pile. The caller, and the calls to
internal functions in the caller's code, won't be affected at all.
Please understand that. Only the library code will.

Now we can surely find other cases where we may adapt the patch or make it
more obvious, but for my own sake, get over this "it will break and
change everything everywhere" idea. It does not.

Repeating that strict mode is required from a library author's point of
view does not make it right. You always get the types you want, you just
limit the library consumer.

No, you do not.

10 years ago by Sara Golemon

We can sigh and tut about this not being "the PHP way", but the script
author was the one who chose to enter into a tight contract, and the
script author, not you, is the one who should have that authority over
their own application.

I find this view way too extreme.

You find giving authority over an application to the application
author too extreme?

And you find taking authority over a library away from the library
author completely acceptable?

I'm not suggesting taking authority over the library away from its
author, so your question is invalid.

If I write an API that works perfectly well in strict mode, why
shouldn't I be able to turn strict on for my whole library? Do I just
tell users that non-strict mode constitutes undefined behavior for
this library, and refuse to fix any bugs that come up because of it?

This RFC allows you to turn on strict for your library. What it
doesn't do is allow you to turn on strict for the calling application.
What's yours is yours, what's theirs is theirs.

-Sara

10 years ago by Larry Garfield

Don't mistake me for hack. -Sara

No one could ever mistake you for a hack, Sara. :-)

--Larry Garfield

(Sorry, it was just sitting there...)

10 years ago by Stanislav Malyshev

Hi!

So let's talk compromise.
Would leaving internal functions out of the picture at this stage
change your mind? This is effectively what Hack does, internal
functions are explicitly marked as "coercible".

For me, the option that makes users remember which functions are
internal and which are not, because they work radically differently (and I
don't mean in some small detail, I mean up to a complete failure if I
get it wrong) is not something I'd really like.

Would a tri-state option make sense? ('weak-all',
'strict-user/weak-internal', 'strict-all')
How do we get from here to something you would like?

Two semantics in the same language are bad enough. Three, IMHO, is just
a no go, dealing with code having three different semantics would be
completely impossible.

Stas Malyshev
smalyshev@gmail.com

10 years ago by Larry Garfield

Once again, anyone can take over version 0.3, if it is so great. Why don't you do it?
I will play the game, stop working on my proposal, and vote 'yes' again.
But don't ask me to do it in your place.

If nobody else does it, I will.

I think Andrea's 0.3 proposal was extremely well balanced, served
everyone's needs whether they would admit it or not, and whose only
failing (subjectively termed) was the use of declare(). I think
declare() is fine and not nearly as ugly as some have slandered it to
be, but I'm willing to read the winds and modify it for v0.4.

Straw poll:

<?php strict;

<?php-strict

use strict; (pseudo-namespace)

<?php // strict (I don't actually like HHVM's style, but if you do...)

declare(strict=true); (As a top-level declare only)

declare(strict=true); (exactly as in v0.3 -- maybe you liked it)

your write-in vote here

I'm not going to scope in union types, nullables, or falsables. We
can leave that for a followup RFC, this one is contentious enough as
it is.

-Sara

As ugly as the declare() syntax is, "strict" means very little. What
else qualifies as "strict"? It also seems likely to be confused with
E_STRICT, which is something else.

So I suppose I'd hold my nose and stick with 4 or 5.

That said, I think it was Zeev (or maybe Andi?) that suggested
tightening the coercion rules at the same time to reduce the need for
strict mode. (Eg, "99 red balloons" silently becoming int 99 is asking
for trouble even in weak-mode.) I don't know if it's possible to tidy
that up as well at this point, but it's a fair point that the coercion
now is sometimes too forgiving.
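For readers who have not run into it, the forgiving behaviour is easy to reproduce with an explicit cast, which never complains. This is plain PHP behaviour, shown only to illustrate the example above.

```php
<?php
$input = "99 red balloons";

// An explicit cast keeps the leading digits and silently drops the rest.
var_dump((int) $input);       // int(99)

// The string as a whole is not considered numeric.
var_dump(is_numeric($input)); // bool(false)
```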

I don't know enough about the internals code to suggest how to address
Rasmus' points, but they sound valid from the outside, at least. Not
steering people toward "well just throw casts around" is something to
keep in mind.

--Larry Garfield

10 years ago by Zeev Suraski

-----Original Message-----
From: php@golemon.com [mailto:php@golemon.com] On Behalf Of Sara
Golemon
Sent: Tuesday, February 17, 2015 1:58 AM
To: francois@php.net
Cc: Philip Sturgeon; Arvids Godjuks; Jefferson Gonzalez; Rowan Collins;
PHP internals
Subject: Re: [PHP-DEV] Reviving scalar type hints

On Mon, Feb 16, 2015 at 2:50 PM, François Laupretre francois@php.net
wrote:

Once again, anyone can take over version 0.3, if it is so great. Why
don't you do it?
I will play the game, stop working on my proposal, and vote 'yes' again.
But don't ask me to do it in your place.

If nobody else does it, I will.

I think Andrea's 0.3 proposal was extremely well balanced, served
everyone's needs whether they would admit it or not, and whose only
failing (subjectively termed) was the use of declare(). I think
declare() is fine and not nearly as ugly as some have slandered it to
be, but I'm willing to read the winds and modify it for v0.4.

Straw poll:

<?php strict;

I like this one best. Obviously it requires some extra definition (e.g. I
assume it will only be possible at the first block, perhaps only as the
first part of the file or such).

That syntax poll aside, I had what I hope is some sort of enlightenment, and
I think I know what will get me to cast my vote in favor of 'strict', as a
true supporter. There's one very special conversion that's too common in
PHP to dismiss (IMHO), and is also completely harmless in 99.999% of the
cases, if not strictly 100%:

string->int or string->float conversions, when the string looks exactly
like a number. Given almost all of our inputs (from databases, forms,
files, etc.) come in string form, that's by far the conversion that's going
to give us the most trouble - and is most probably the use case that is
going to result in by far the largest number of explicit casts to work
around that problem.

What if strict mode didn't just blindly check zval.type, but actually
allowed for this one type of conversion to go through without a problem?

No bool to int. No Apple to int. No "123 testing" to float. Just "32" to
32 and "37.7" to 37.7.
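A rough userland sketch of the rule described above, assuming "looks exactly like a number" means an optional sign, digits, and (for floats) an optional fraction and exponent. The helper name and the exact patterns are assumptions made for illustration; the actual check would be done in the engine, not in PHP code.

```php
<?php
// Hypothetical helper modelling the proposed check; not part of PHP or any RFC.
// $target is 'int' or 'float'. Anything that is neither the exact type nor a
// string that looks exactly like such a number is rejected.
function coerce_number($value, string $target)
{
    if ($target === 'int') {
        if (is_int($value)) {
            return $value;
        }
        // D modifier: "$" matches only at the very end (no trailing newline).
        if (is_string($value) && preg_match('/^[+-]?\d+$/D', $value) === 1) {
            // Assumption: digit strings beyond the integer range are not
            // handled in this sketch.
            return (int) $value;
        }
    } elseif ($target === 'float') {
        if (is_float($value)) {
            return $value;
        }
        $pattern = '/^[+-]?(\d+(\.\d*)?|\.\d+)([eE][+-]?\d+)?$/D';
        if (is_string($value) && preg_match($pattern, $value) === 1) {
            return (float) $value;
        }
    }
    throw new InvalidArgumentException(
        sprintf('Cannot pass %s as %s', gettype($value), $target)
    );
}

var_dump(coerce_number("32", 'int'));     // int(32)
var_dump(coerce_number("37.7", 'float')); // float(37.7)
```

With this sketch, true, "Apple" and "123 testing" all raise an exception instead of being converted.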

Can the strict supporting camp consider that option? Judging by the
examples brought up by proponents of v0.3, this type of conversion isn't
their issue. It's the 'obviously wrong' conversions that bug them.

As a secondary concern I'd put int to float conversions where no
(meaningful) data is being lost, e.g. 37 -> 37.0, even though a tiny bit of
accuracy can be lost for very large values. Rejecting this particular
conversion would make us stricter than almost all other languages. That
said, it's a much less common conversion so I don't feel nearly as strongly
about it as the string one.
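To make the "tiny bit of accuracy" concrete: a double has a 53-bit significand, so every integer up to 2**53 in magnitude converts exactly, while larger ones may be rounded. The sketch below uses that as a deliberately conservative bound; the constant and function names are hypothetical, and a 64-bit PHP build is assumed.

```php
<?php
// Hypothetical names, for illustration only. Assumes a 64-bit PHP build.
const MAX_EXACT_INT = 9007199254740992; // 2**53

// Conservative test: true means the int is guaranteed to survive int -> float
// unchanged. Some larger values (e.g. exact powers of two) also survive, but
// this sketch does not try to detect them.
function int_to_float_is_exact(int $i): bool
{
    return $i >= -MAX_EXACT_INT && $i <= MAX_EXACT_INT;
}

var_dump((float) 37);                                 // float(37)
var_dump(int_to_float_is_exact(37));                  // bool(true)
var_dump(int_to_float_is_exact(MAX_EXACT_INT + 1));   // bool(false)
```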

Last, regarding internal functions, I do agree with Rasmus and Drew. Again,
judging by the examples brought up as a part of the discussion, it seems that
giving ZPP developers a standard, easy way of being a lot more picky
about the types of values of particular arguments, and allowing them to do
this selectively for those places that are very sensitive (e.g.
curl_setopt()), would be a more suitable solution than just turning on strict
validation for all internal functions, which were built with very
different assumptions in mind. The biggest difference between internal
functions and userland functions in the context of this RFC is that with
userland functions, we're starting with a clean slate. There are no type
hint definitions anywhere, developers can selectively add them as they see
fit. With internal functions, we have type requirements EVERYWHERE already,
but with semantics that didn't take strict typing into account at all and
evolved for almost 20 years. Strict typing the way it is now would
radically change these semantics overnight, and would make the adoption of
strict a much bigger headache than it needs to be, as I believe Rasmus
demonstrated.

I think addressing these issues could get us a LOT closer to consensus and
make a lot of those who voted 'no' on v0.3 vote yes on v0.4.

Thanks for your consideration.

Zeev

10 years ago by Sara Golemon

That syntax poll aside, I had what I hope is some sort of enlightenment, and
I think I know what will get me to cast my vote in favor of 'strict', as a
true supporter. There's one very special conversion that's too common in
PHP to dismiss (IMHO), and is also completely harmless in 99.999% of the
cases, if not strictly 100%:

string->int or string->float conversions, when the string looks exactly
like a number. Given almost all of our inputs (from databases, forms,
files, etc.) come in string form, that's by far the conversion that's going
to give us the most trouble - and is most probably the use case that is
going to result in by far the largest number of explicit casts to work
around that problem.

What if strict mode didn't just blindly check zval.type, but actually
allowed for this one type of conversion to go through without a problem?

No bool to int. No Apple to int. No "123 testing" to float. Just "32" to
32 and "37.7" to 37.7.

Can the strict supporting camp consider that option? Judging by the
examples brought up by proponents of v0.3, this type of conversion isn't
their issue. It's the 'obviously wrong' conversions that bug them.

As a secondary concern I'd put int to float conversions where no
(meaningful) data is being lost, e.g. 37 -> 37.0, even though a tiny bit of
accuracy can be lost for very large values. Rejecting this particular
conversion would make us stricter than almost all other languages. That
said, it's a much less common conversion so I don't feel nearly as strongly
about it as the string one.

Last, regarding internal functions, I do agree with Rasmus and Drew. Again,
judging by the examples brought up as a part of the discussion, it seems that
giving ZPP developers a standard, easy way of being a lot more picky
about the types of values of particular arguments, and allowing them to do
this selectively for those places that are very sensitive (e.g.
curl_setopt()), would be a more suitable solution than just turning on strict
validation for all internal functions, which were built with very
different assumptions in mind. The biggest difference between internal
functions and userland functions in the context of this RFC is that with
userland functions, we're starting with a clean slate. There are no type
hint definitions anywhere, developers can selectively add them as they see
fit. With internal functions, we have type requirements EVERYWHERE already,
but with semantics that didn't take strict typing into account at all and
evolved for almost 20 years. Strict typing the way it is now would
radically change these semantics overnight, and would make the adoption of
strict a much bigger headache than it needs to be, as I believe Rasmus
demonstrated.

I think addressing these issues could get us a LOT closer to consensus and
make a lot of those who voted 'no' on v0.3 vote yes on v0.4.

So, if you'll permit me to summarize your message. The following
would be palatable to you?

Lossless coercion. This would sit somewhere between strict types
and weak types: lossy conversions (object->__toString() for objects
passed where string expected, any->bool) would be disallowed, but
lossless conversions (numeric strings to number types, int to float,
whole floats to ints, numbers to strings) would be allowed. No implicit
conversions to bools, no non-numeric strings to numerics, etc.

Exclude internal functions from the strict switch. (Perhaps have a
separate switch for internal functions at a later date)

With option to introduce features such as the following at a later date:

Union types (e.g. function foo((int | float) $value): (bool | string) { ... })

Typedefs (e.g. TypeDef (int|float) numeric; -- Some defined as
standard (like numeric), others user-definable)
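Until something like the "(int | float)" hint exists, the same contract can only be approximated by hand inside the function body. A minimal sketch, with a made-up function name, of what such a check amounts to:

```php
<?php
// Hypothetical userland stand-in for a "(int | float) $value" hint.
// double_it() is an invented example function, not a proposed API.
function double_it($value)
{
    // Accept exactly the two member types of the union; no coercion at all.
    if (!is_int($value) && !is_float($value)) {
        throw new InvalidArgumentException(
            'Expected int or float, got ' . gettype($value)
        );
    }
    return $value * 2;
}

var_dump(double_it(21));  // int(42)
var_dump(double_it(1.5)); // float(3)
```

Note that this manual check is strict by construction: the numeric string "21" is rejected, which is one reason the interaction between union types and coercion rules keeps coming up in this thread.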

-Sara

10 years ago by Zeev Suraski

In a word, yes. Have to say your abilities to compile Zeev -> formal declaration are pretty amazing :)

Zeev

10 years ago by Lester Caine

Typedefs (e.g. TypeDef (int|float) numeric; -- Some defined as
standard (like numeric), others user-definable)

And also ... int4, int8 and similar for correctly constrained values.
In an ideal world the whole SQL standard types would be available, but
this at least would allow int to become an unconstrained object if
people want that.

It's the whole "we can fix it later" that I don't like ... especially
when other votes are changing the goal posts in parallel.

--
Lester Caine - G8HFL

Contact - http://lsces.co.uk/wiki/?page=contact
L.S.Caine Electronic Services - http://lsces.co.uk
EnquirySolve - http://enquirysolve.com/
Model Engineers Digital Workshop - http://medw.co.uk
Rainbow Digital Media - http://rainbowdigitalmedia.co.uk

10 years ago by francois@php.net

Hi Sara,

From: php@golemon.com [mailto:php@golemon.com] On Behalf Of Sara
Golemon

So, if you'll permit me to summarize your message. The following
would be palatable to you?

Lossless coercion. This would sit somewhere between strict types
and weak types: lossy conversions (object->__toString() for objects
passed where string expected, any->bool) would be disallowed, but
lossless conversions (numeric strings to number types, int to float,
whole floats to ints, numbers to strings) would be allowed. No implicit
conversions to bools, no non-numeric strings to numerics, etc.

It must be implemented as a modification to zpp conversion rules. IMHO, a third mode would be the worst thing to do.
It is not about being lossless or not. People expect bool -> int to be disabled, for example, even though it is lossless.
It is more a question of finding a consensus about conversions which don't make sense, and disabling them. Examples include bool conversion to any other type and, of course, disallowing trailing chars in numeric strings.
Once this is done, 99% of strict type proponents already said they would be OK with the so-called (not so) weak mode, making it possible to get rid of this far from perfect two-mode mechanism (actually, this two-mode approach reminds me more and more of a great idea about bringing transparent Unicode support to PHP).

If future development tools require more precise types (and they probably will), the mechanism can be extended with new keywords, but the idea is never using the same keyword for different ways to check a zval. New keywords would be defined at the zpp level, keeping internal and userspace features in sync.

Would you agree to propose this as an option in 0.4? Actually, I will write it as a separate RFC, as modifying zpp conversion rules has impacts well beyond type hinting. Then, type hinting RFCs can reference it.

Exclude internal functions from the strict switch. (Perhaps have a
separate switch for internal functions at a later date)

This would make the feature inconsistent from an end user's pov.

If a user enables strict mode, he assumes strict checks for every function he calls. We know that userspace and internal functions use different mechanisms, but the end user doesn't have to know. From his pov, a function is a function. If the documentation states that the function he's calling in strict mode accepts int, he will expect int to be checked in strict mode. We cannot say 'Oh, wait, it's an internal function, the rules are not the same'. The distinction between internal and userspace functions is irrelevant for him and has to remain so.

So, from all these arguments, I now think that strict types, as defined in 0.3, are not the best solution.

With option to introduce features such as the following at a later date:

Union types (e.g. function foo((int | float) $value): (bool | string) { ... })

Typedefs (e.g. TypeDef (int|float) numeric; -- Some defined as
standard (like numeric), others user-definable)

As I told Zeev, union types can be kept for the future if we don't go the nullable road, as it would be too confusing making 'string|null' and '?string' synonyms.

The question of null is not so simple. IMO, it should be left for a future discussion about union types. Anyway, once people get used to scalar hinting, the need for union types will arise quickly.

Regards

10 years ago by Zeev Suraski

-----Original Message-----
From: François Laupretre [mailto:francois@php.net]
Sent: Tuesday, February 17, 2015 2:58 PM
To: 'Sara Golemon'; 'Zeev Suraski'
Cc: 'PHP internals'
Subject: RE: [PHP-DEV] Reviving scalar type hints

It is not about being lossless or not. People expect bool -> int to be
disabled, for example, even though it is lossless.

It is more a question of finding a consensus about conversions which don't
make sense, and disabling them. Examples include bool conversion to any
other type and, of course, disallowing trailing chars in numeric strings.

I agree. It's more of a question of eliminating potentially dangerous
conversions than just being lossless.

Once this is done, 99% of strict type proponents already said they would
be OK with the so-called (not so) weak mode, making it possible to get rid
of this far from perfect two-mode mechanism (actually, this two-mode
approach reminds me more and more of a great idea about bringing
transparent Unicode support to PHP).

Even though that's not what I meant when I sent my proposal in the morning,
I've been wondering about the same thing (also with the feedback from
Dmitry). Can we go an extra step from both directions, and come up with a rule
set that is stricter than the currently proposed weak hinting, but not as
strict as the currently proposed strict hinting?
The key challenge I see with that is that scalar type hinting would move
further away from our implicit casting rules. However, the current RFC is
already aligned with ZPP as opposed to implicit casting rules (e.g. by
rejecting "Apple" as an int). Choosing between that and having two separate
modes, I think that's the better option.

Exclude internal functions from the strict switch. (Perhaps have a
separate switch for internal functions at a later date)

This would make the feature inconsistent from an end user's pov.

If a user enables strict mode, he assumes strict checks for every function
he calls. We know that userspace and internal functions use different
mechanisms, but the end user doesn't have to know. From his pov, a
function is a function.

I think that practically speaking, that is incorrect, at least from my
experience with PHP developers. They do differentiate between built-in
functions and userland functions. There are some fundamental differences
between the two (being able to find their code, step into them in a
debugger, find docs on php.net, etc.).
As I mentioned earlier, the fundamental difference between built-in
functions and userland functions in the context of our discussion is that if
we introduce userland type hints, nothing happens before people change their
code, and make (hopefully) informed decisions about what type hints to add,
if any. No such luck with built-in functions, which have type information
associated, built collectively over the last two decades. As Rasmus
demonstrated, flipping that switch on for built-in functions results in a
lot of work to 'clean' the code up, but you end up with having code that's
not necessarily any better. That said, it's quite possible that the
situation will be much improved if & when we implement the less-strict rules
we're proposing here, which would accept "32" as an integer or 37 as a
float.

If we still see that employing the strict(er) rules is very noisy with
internal functions, a more appropriate option may be introducing new types
into ZPP, that would correspond to the new rules we introduce in the
userland type hints, and requiring extension authors to explicitly move to
them where they believe it's appropriate. That will allow extension authors
to make their choice regarding their APIs, similarly to the process that
will happen in userland.

So, from all these arguments, I now think that strict types, as defined in
0.3, are not the best solution.

With option to introduce features such as the following at a later date:

Union types (e.g. function foo((int | float) $value): (bool | string) { ... })

Typedefs (e.g. TypeDef (int|float) numeric; -- Some defined as
standard (like numeric), others user-definable)

As I told Zeev, union types can be kept for the future if we don't go the
nullable road, as it would be too confusing making 'string|null' and
'?string' synonyms.

I think we all agree about that.

Zeev

10 years ago by Andrey Andreev

Hi,

-----Original Message-----
From: François Laupretre [mailto:francois@php.net]
Sent: Tuesday, February 17, 2015 2:58 PM
To: 'Sara Golemon'; 'Zeev Suraski'
Cc: 'PHP internals'
Subject: RE: [PHP-DEV] Reviving scalar type hints

It is not about being lossless or not. People expect bool -> int to be
disabled, for example, even though it is lossless.

It is more a question of finding a consensus about conversions which don't
make sense, and disabling them. Examples include bool conversion to any
other type and, of course, disallowing trailing chars in numeric strings.

I agree. It's more of a question of eliminating potentially dangerous
conversions than just being lossless.

Agreed as well. However, while bool -> int conversion is one of the
reasons why many people want strict type-hints, it also often makes
sense and is quite widespread. There's no silver bullet for that
problem.

Once this is done, 99% of strict type proponents already said they would
be OK with the so-called (not so) weak mode, making it possible to get rid
of this far from perfect two-mode mechanism (actually, this two-mode
approach reminds me more and more of a great idea about bringing
transparent Unicode support to PHP).

Even though that's not what I meant when I sent my proposal in the morning,
I've been wondering about the same thing (also with the feedback from
Dmitry). Can we go an extra step from both directions, and come up with a rule
set that is stricter than the currently proposed weak hinting, but not as
strict as the currently proposed strict hinting?
Key challenge I see with that is that scalar type hinting would go farther
apart from our implicit casting rules. However, the current RFC already
aligned with ZPP as opposed to implicit casting rules (e.g. by rejecting
"Apple" as an int). Choosing between that and having two separate modes, I
think that's the better option.

I don't think it has 99% support, but surely better than the dual-mode
approach indeed.

But is it the best solution? The dual-mode approach was suggested
because there is need and demand for two kinds of type-hinting. Most
of the controversy and criticism came from the fact that it introduces a
switchable mode, while most of the praise received was due to somebody
finally proposing both solutions at the same time.

Exclude internal functions from the strict switch. (Perhaps have a
separate switch for internal functions at a later date)

This would make the feature inconsistent from an end user's pov.

If a user enables strict mode, he assumes strict checks for every function
he calls. We know that userspace and internal functions use different
mechanisms, but the end user doesn't have to know. From his pov, a
function is a function.

I think that practically speaking, that is incorrect, at least from my
experience with PHP developers. They do differentiate between built-in
functions and userland functions. There are some fundamental differences
between the two (being able to find their code, step into them in a
debugger, find docs on php.net, etc.).
As I mentioned earlier, the fundamental difference between built-in
functions and userland functions in the context of our discussion is that if
we introduce userland type hints, nothing happens before people change their
code, and make (hopefully) informed decisions about what type hints to add,
if any. No such luck with built-in functions, which have type information
associated, built collectively over the last two decades. As Rasmus
demonstrated, flipping that switch on for built-in functions results in a
lot of work to 'clean' the code up, but you end up with having code that's
not necessarily any better.

Completely agree with that statement. I tried to explain the same thing
multiple times already ... unfortunately with no success.

That said, it's quite possible that the
situation will be much improved if & when we implement the less-strict rules
we're proposing here, which would accept "32" as an integer or 37 as a
float.

That might work, if we're indeed looking for a compromise.
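
To make the compromise concrete, here is a minimal sketch of what such an in-between rule set could look like when written as plain userland code. The helper names coerceToInt() and coerceToFloat() are hypothetical and are not part of PHP or of any RFC; they only model the rules mentioned in this thread ("32" accepted as an integer, 37 accepted as a float, bool and trailing characters refused).

```php
<?php
// Hypothetical helpers, not part of PHP: they model one possible
// "stricter than weak, looser than strict" rule set from this thread.

function coerceToInt($value): int
{
    if (is_int($value)) {
        return $value;
    }
    // Numeric strings are accepted only when they consist of digits
    // (with an optional leading minus); trailing characters are refused.
    // Range/overflow checks are left out of this sketch.
    if (is_string($value) && preg_match('/^-?[0-9]+$/D', $value) === 1) {
        return (int) $value;
    }
    // Everything else, including bool, is refused.
    throw new InvalidArgumentException('Not acceptable as int: ' . var_export($value, true));
}

function coerceToFloat($value): float
{
    if (is_float($value)) {
        return $value;
    }
    // An int is accepted where a float is expected (37 becomes 37.0).
    if (is_int($value)) {
        return (float) $value;
    }
    throw new InvalidArgumentException('Not acceptable as float: ' . var_export($value, true));
}
```

With these assumed rules, coerceToInt("32") and coerceToFloat(37) succeed, while coerceToInt(true), coerceToInt("32 apples") and coerceToInt("Apple") all throw.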

If we still see that employing the strict(er) rules is very noisy with
internal functions, a more appropriate option may be introducing new types
into ZPP, that would correspond to the new rules we introduce in the
userland type hints, and requiring extension authors to explicitly move to
them where they believe it's appropriate. That will allow extension authors
to make their choice regarding their APIs, similarly to the process that
will happen in userland.

And that brings us back to square one ... Expose only 1 tool to
userland, but then give two options to the much less-populated crowd
of extension developers. That doesn't make sense to me.

So, from all these arguments, I now think that strict types, as defined in
0.3, are not the best solution.

With option to introduce features such as the following at a later date:

Union types (e.g. function foo((int | float) $value): (bool | string) { ... })

Typedefs (e.g. TypeDef (int|float) numeric; -- Some defined as
standard (like numeric), others user-definable)

As I told Zeev, union types can be kept for the future if we don't go the
nullable road, as it would be too confusing making 'string|null' and
'?string' synonyms.

I think we all agree about that.

Yep.

Cheers,
Andrey.

10 years ago by Lester Caine

I agree. It's more of a question of eliminating potentially dangerous
conversions than just being lossless.

Agreed as well. However, while bool -> int conversion is one of the
reasons why many people want strict type-hints, it also often makes
sense and is quite widespread. There's no silver bullet for that
problem.

Returning 'not-zero/empty' as true and 'zero' as false is one of the
natural things to use in PHP and I don't think any other language has
that flexibility? It is also why some of the other 'little changes' such
as hard coded IS_TRUE and IS_FALSE are actually somewhat alien!
Certainly it does not play well with my methods of working, but then I
prefer a function to return a result rather than crash out with an
exception ... Although -ve values are even more useful than a simple
'zero' return and that may replace a string return.

--
Lester Caine - G8HFL

10 years ago by francois@php.net

De : Lester Caine [mailto:lester@lsces.co.uk]

Returning 'not-zero/empty' as true and 'zero' as false is one of the
natural things to use in PHP and I don't think any other language has
that flexibility?

You didn't read it right.

I was talking of conversions from bool, not to bool. (int -> bool) is fine and will be preserved, but I propose to remove (bool -> int). You will still return numbers as bool, and non-zero will still be converted to true. Relax :)
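
A minimal sketch of the direction that is preserved, assuming PHP 7.0-style scalar declarations in the default (non-strict) mode. The function name hasRecords() is only an illustration:

```php
<?php
// Weak (default) mode: this file has no declare(strict_types=1).
// hasRecords() is a hypothetical example function.
function hasRecords(int $count): bool
{
    // The int return value is converted to bool: non-zero becomes true.
    return $count;
}

var_dump(hasRecords(3)); // bool(true)
var_dump(hasRecords(0)); // bool(false)
```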

Regards

François

10 years ago by Lester Caine

Returning 'not-zero/empty' as true and 'zero' as false is one of the
natural things to use in PHP and I don't think any other language has
that flexibility?

You didn't read it right.

I was talking of conversions from bool, not to bool. (int -> bool) is fine and will be preserved, but I propose to remove (bool -> int). You will still return numbers as bool, and non-zero will still be converted to true. Relax :)

My current practice up until now has been to use 'return false' when an
action failed, but the main return would be a number of records or
string of data. So you are now blocking that activity ... I'm reading it
right, but you are not thinking all possibilities through. But I think
I'm starting to see it broken already with the other changes to the core
:( I'm not returning 'IS_FALSE' so I'm probably going to have to change
the 'false' to '0' anyway so as to avoid the bool?

--
Lester Caine - G8HFL

10 years ago by francois@php.net

De : Lester Caine [mailto:lester@lsces.co.uk]

My current practice up until now has been to use 'return false' when an
action failed, but the main return would be a number of records or
string of data. So you are now blocking that activity ... I'm reading it
right, but you are not thinking all possibilities through. But I think
I'm starting to see it broken already with the other changes to the core
:( I'm not returning 'IS_FALSE' so I'm probably going to have to change
the 'false' to '0' anyway so as to avoid the bool?

Hi Lester,

I am not blocking anything. My objective is to provide union types. Using union types, you will declare your return type as 'int|bool'. That's why I was pushing to integrate this feature in the first release. It will be done if we have enough time, but it is a lot to integrate in a discussion that must be quite short, as time is restricted.

I am surely not thinking through all possibilities :) but returning an integer or false is usual and in scope. The solution is not to authorize (bool -> int) conversion (as it would have to support (bool -> anything)), but to support 'int|bool', 'resource|bool', and similar syntax. So, the solution is completely different from the (int -> bool) question.
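
As a rough sketch of the 'int|bool' idea, assuming a PHP version that supports union types in declarations (8.0 or later). The function countMatches() and its behaviour are hypothetical and only serve as an illustration:

```php
<?php
// Requires a PHP version with union types (8.0 or later).
// countMatches() is a hypothetical function used for illustration.
function countMatches(array $rows, string $needle): int|bool
{
    if ($needle === '') {
        // Failure is signalled with false, which stays distinct from 0.
        return false;
    }

    // Count the rows that contain the needle.
    $matching = array_filter(
        $rows,
        fn(string $row): bool => str_contains($row, $needle)
    );

    return count($matching);
}

$result = countMatches(['apple', 'pear', 'fig'], 'p');
if ($result === false) {
    echo "search failed\n";
} else {
    echo "matches: $result\n";
}
```

The caller still has to compare against false with ===, so "no match" (0) and "failure" (false) stay distinguishable.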

Even if we don't release union types in 7.0, it will be clearly stated that it is a required follow-up. That's not perfect but we do it as fast as we can and everyone is welcome to help.

What does this mean in your case ? Just that, as long as the feature is not available, your function won't have an explicit return type. Period. And, please, don't change false to 0 ;).

Regards

François

10 years ago by Lester Caine

What does this mean in your case ? Just that, as long as the feature is not available, your function won't have an explicit return type. Period. And, please, don't change false to 0 ;).

I simply can't see the case for limited function type hints at all! I
either already have clean defined data from the database, or I need to
validate the data from users before using it. While validating I need to
confirm constraints of data type so adding some extra wrapper that only
does half the job just seems a pointless exercise. Annotating the
correct data type would be of more use and I already have that in the
docblock and my IDE produces those hints while I am writing the code -
which it has done for many years. To my mind it IS in the IDE that much
of this stuff which people keep saying is not 'runtime' should be
managed, and anything that is not needed at 'runtime' should be
removable but what is being added across several areas all seem to
be adding the same things - partially - using different methods - without
any obvious gain. Additionally I'm now passing data as an array as that
was the 'best practice' a few years back so it is rare to be passing a
single value anyway.

--
Lester Caine - G8HFL

10 years ago by francois@php.net

De : Lester Caine [mailto:lester@lsces.co.uk]

What does this mean in your case ? Just that, as long as the feature is not
available, your function won't have an explicit return type. Period. And,
please, don't change false to 0 ;).

I simply can't see the case for limited function type hints at all! I

If you can't see it after so much was written on the subject, what can we do ? Do you imply that, if you cannot understand the need, it does not exist ?

either already have clean defined data from the database, or I need to
validate the data from users before using it.

If that's your only data source, that's OK. I confirm you probably don't need type hinting. You probably even don't need functions, classes and the rest. A 50-line script should fit.

While validating I need to
confirm constraints of data type so adding some extra wrapper that only
does half the job just seems a pointless exercise.

The point of type hinting is not validating user input.

Annotating the
correct data type would be of more use and I already have that in the
docblock and my IDE produces those hints while I am writing the code -
which it has done for many years.

That's different. There is some overlap, but the purpose is not the same. IDEs can only do a limited set of static analysis. As soon as you have indirect calls, IDE-based static analysis breaks down, let alone comment-stripped libraries and others. If you want more constraints on input and output, look at DbC (design by contract), as it can handle tests too slow to run in production.

To my mind it IS in the IDE that much
of this stuff which people keep saying is not 'runtime' should be
managed, and anything that is not needed at 'runtime' should be
removable but what is being added across several areas all seem to
be adding the same things - partially - using different methods - without
any obvious gain. Additionally I'm now passing data as an array as that
was the 'best practice' a few years back so it is rare to be passing a
single value anyway.

IDEs are worthwhile but not mandatory and runtime features have nothing to do with IDEs.

Now, you were (aggressively) complaining about returning int or false. You were sure you had found the case that would prove all of this was ready for the bin. I took the time to explain. Then, you're complaining it's no use because you don't understand its purpose and because you chose to bundle your arguments in arrays...

Unfortunately, I'm afraid I can't do more for you than say what I generally hate to say: 'If you don't like it, don't use it'.

I try to be kind with constructive posts but, here, I have better things to do.

François

10 years ago by Lester Caine

De : Lester Caine [mailto:lester@lsces.co.uk]

What does this mean in your case ? Just that, as long as the feature is not
available, your function won't have an explicit return type. Period. And,
please, don't change false to 0 ;).

I simply can't see the case for limited function type hints at all! I

If you can't see it after so much was written on the subject, what can we do ? Do you imply that, if you cannot understand the need, it does not exist ?

Since it has already been said that what is proposed is 'just a start'
then the missing bits may get added later - and if it is anything like
PDO will it arrive before PHP10 :)

either already have clean defined data from the database, or I need to
validate the data from users before using it.

If that's your only data source, that's OK. I confirm you probably don't need type hinting. You probably even don't need functions, classes and the rest. A 50-line script should fit.

http://hg.lsces.org.uk/bw/bitweaver/ is a little more than 50 lines ...
lsces.org.uk/bitweaverdocsPHP/index.html hasn't been updated since we
lost the original phpDocumentor but now I do need to try and get a new
version run including all the e_strict stuff that has been reworked in
the last 5 years.

While validating I need to
confirm constraints of data type so adding some extra wrapper that only
does half the job just seems a pointless exercise.

The point of type hinting is not validating user input.

Again limited application of a useful function. Strict scalar type hints
are only there to give an error when something is wrong. How is that not
doing validation? If you HAVE validated the data then where do you need
the hint other than reminding you just what you need to validate to? And
if you have validated then one can pass a value rather than a string anyway?

Annotating the
correct data type would be of more use and I already have that in the
docblock and my IDE produces those hints while I am writing the code -
which it has done for many years.

That's different. There is some overlap, but the purpose is not the same. IDEs can only do a limited set of static analysis. As soon as you have indirect calls, IDE-based static analysis breaks down, let alone comment-stripped libraries and others. If you want more constraints on input and output, look at DbC (design by contract), as it can handle tests too slow to run in production.

I would not even bother trying to understand the PHP code base without a
decent IDE, and Eclipse provides that. PHP files live next to C/C++ and
other file formats, so the one set of tools allow everything to work
productively. I don't find any problem seeing indirect call annotation
and although PHPeclipse is now struggling with new things ( like
dropping the ?> in the MIDDLE of phpt files :( ) it has done the jobs
that most of this extra infrastructure is trying to duplicate for a lot
of the life of PHP5.

To my mind it IS in the IDE that much
of this stuff which people keep saying is not 'runtime' should be
managed, and anything that is not needed at 'runtime' should be
removable but what is being added across several areas all seem to
be adding the same things - partially - using different methods - without
any obvious gain. Additionally I'm now passing data as an array as that
was the 'best practice' a few years back so it is rare to be passing a
single value anyway.

IDEs are worthwhile but not mandatory and runtime features have nothing to do with IDEs.

Now, you were (aggressively) complaining about returning int or false. You were sure you had found the case that would prove all of this was ready for the bin. I took the time to explain. Then, you're complaining it's no use because you don't understand its purpose and because you chose to bundle your arguments in arrays...

Flagging that a value HAS to be an integer is fine, and hopefully the
move to make that an unconstrained object rather than a simple register
value has passed, but even here the next step is rather than hiding the
fact that in many cases there IS a limit, the move to 64bit values in
parallel with 32bit ones needs much better management and it's the total
disregard for that which is my problem. 'PHP has not worried about that
in the past' neatly sidesteps the fact that we USED 32bit builds of PHP5
to avoid the problems, but nowadays we have legacy systems that are
still locked to 32bit values while new systems are running 64bit and so
now it IS something to at least take some interest in. int4 and int8 are
equally as important as int(unlimited) and while ignoring 32bit builds
may be practical for some, it still has a place in many areas.

Unfortunately, I'm afraid I can't do more for you than say what I generally hate to say: 'If you don't like it, don't use it'.

I try to be kind with constructive posts but, here, I have better things to do.

I've indicated where I would have less of a problem with 'hinting' if it
was actually providing a complete set of hints. Being able to apply
those hints to the elements of an array would be useful, but this is
where 'other camps' would be expecting data to be returned as objects
with all the overheads that applies but which the current offerings
probably fit better with? Validation of data IS the problem here but
this is not a solution to that problem, just another sticking plaster.

--
Lester Caine - G8HFL

10 years ago by Zeev Suraski

-----Original Message-----
From: Andrey Andreev [mailto:narf@devilix.net]
Sent: Tuesday, February 17, 2015 4:49 PM
To: Zeev Suraski
Cc: francois@php.net; Sara Golemon; PHP internals
Subject: Re: [PHP-DEV] Reviving scalar type hints

Hi,

I agree. It's more of a question of eliminating potentially dangerous
conversions than just being lossless.

Agreed as well. However, while bool -> int conversion is one of the reasons
why many people want strict type-hints, it also often makes sense and is
quite widespread. There's no silver bullet for that problem.

I'm not sure we need a silver bullet. If the conversion and acceptance
rules are clear and reasonable, it's an entirely valid outcome that in cases
where both int and bool are equally acceptable, you won't use a type hint
but rather explicitly cast to int inside the function.
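
A minimal sketch of that pattern, with a hypothetical function repeatLabel() whose second parameter deliberately carries no type hint:

```php
<?php
// repeatLabel() is a hypothetical example. $times carries no type hint,
// so callers may pass an int or a bool; the cast happens in the body.
function repeatLabel(string $label, $times): string
{
    $times = (int) $times; // true -> 1, false -> 0, "3" -> 3

    // Guard against negative counts, which str_repeat() does not accept.
    return str_repeat($label, max(0, $times));
}

echo repeatLabel('ab', 2), "\n";    // abab
echo repeatLabel('ab', true), "\n"; // ab
```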

But is it the best solution? The dual-mode approach was suggested because
there is need and demand for two kinds of type-hinting. Most of the
controversy and criticism came from the fact that it introduces a switchable
mode, while most of the praise received was due to somebody finally
proposing both solutions at the same time.

Every option has pros and cons. Since it's clear beyond a reasonable doubt
that we can't all agree on purely weak type hints and equally on purely
strict type hints, it becomes a question of what is the right compromise.
Adding both - which at least from my point of view has major drawbacks (too
prominent zval.type exposure; complexity of two systems; internal
functions issue). Creating something in between that would handle most if
not all of the use cases the strict camp brought up, while not (IMHO) overly
focusing on zval.type and making things a lot more noisy/complex for
built-in functions - is a better direction, whose advantages - I think -
outweigh its disadvantages.

If we still see that employing the strict(er) rules is very noisy with
internal functions, a more appropriate option may be introducing new
types into ZPP, that would correspond to the new rules we introduce in
the userland type hints, and requiring extension authors to explicitly
move to them where they believe it's appropriate. That will allow
extension authors to make their choice regarding their APIs, similarly
to the process that will happen in userland.

And that brings us back to square one ... Expose only 1 tool to userland,
but then give two options to the much less-populated crowd of extension
developers. That doesn't make sense to me.

First, let us hope that the situation will be much better to a level that we
don't need to go in that direction :)
But that said, I don't quite see it in the same way. Internal functions
already commonly use a lot more advanced type checking than is commonly
found in userland functions. Functions that behave differently depending on
the type of argument you pass, on the number of arguments, etc. While
technically it's possible to achieve in userland, it's a lot less common.
So while I do see an issue here, I don't think it's that bad if it has to
come to that. Unless we find out our rules work nicely out of the box for
internal functions (which I'm still somewhat hopeful for) - there's no way
to reconcile the fact that internal functions come with this long history of
having detailed type data, while userland functions do not...

Zeev

10 years ago by francois@php.net

De : Andrey Andreev [mailto:narf@devilix.net]

If we still see that employing the strict(er) rules is very noisy with
internal functions, a more appropriate option may be introducing new types
into ZPP, that would correspond to the new rules we introduce in the
userland type hints, and requiring extension authors to explicitly move to
them where they believe it's appropriate. That will allow extension authors
to make their choice regarding their APIs, similarly to the process that
will happen in userland.

And that brings us back to square one ... Expose only 1 tool to
userland, but then give two options to the much less-populated crowd
of extension developers. That doesn't make sense to me.

I must say 'no'. That's completely different from dual-mode. It may not have been clear, but the types we would add to ZPP would also be available in userland.

The objective is to maintain a full consistency between userland, internal funcs, and documentation.

This is so true that we'll probably, in the future, introduce ZPP supported specialized types, like path, to userland. I am also quite sure that we'll add a set of strict types, for the few cases where zval type really matters (like sorting and other 'special' stuff).

But we are not intending to make it more complex than needed for a first release. We all need to practice before identifying additional needs.

Cheers

François

10 years ago by francois@php.net

De : Zeev Suraski [mailto:zeev@zend.com]

Even though that's not what I meant when I sent my proposal in the morning,

Sorry, I was not clear enough : it was my position only.

I've been wondering about the same thing (also with the feedback from
Dmitry). Can we go an extra step from both directions, and come up with a rule
set that is stricter than the currently proposed weak hinting, but not as
strict as the currently proposed strict hinting?
Key challenge I see with that is that scalar type hinting would go farther
apart from our implicit casting rules. However, the current RFC already
aligned with ZPP as opposed to implicit casting rules (e.g. by rejecting
"Apple" as an int). Choosing between that and having two separate modes, I
think that's the better option.

ZPP already chose a while ago to implement a much more restricted conversion set than the convert_to_xxx() rules. The objective is not the same, as casting must be as permissive as possible (we could discuss that in the future too...). I hope we all agree to keep scalar type hinting aligned on ZPP rules.

I also think we all agree, albeit implicitly, on the ruleset changes we're expecting. I am currently writing this down, so that we have a written base to discuss.

Just one detail, to be sure we are in sync : new conversion rules will remain in sync between zend_parse_parameters() and zpp, while future additional types will be defined in zpp only. I assume we are OK to refuse extending the zend_parse_parameters() format in the future.

I think that practically speaking, that is incorrect, at least from my
experience with PHP developers. They do differentiate between built-in
functions and userland functions. There are some fundamental differences
between the two (being able to find their code, step into them in a
debugger, find docs on php.net, etc.).
As I mentioned earlier, the fundamental difference between built-in
functions and userland functions in the context of our discussion is that if
we introduce userland type hints, nothing happens before people change their
code, and make (hopefully) informed decisions about what type hints to add,
if any. No such luck with built-in functions, which have type information
associated, built collectively over the last two decades. As Rasmus
demonstrated, flipping that switch on for built-in functions results in a
lot of work to 'clean' the code up, but you end up with having code that's
not necessarily any better. That said, it's quite possible that the
situation will be much improved if & when we implement the less-strict rules
we're proposing here, which would accept "32" as an integer or 37 as a
float.

I fully agree with you and Rasmus. Strict typing, as defined in 0.3, does not fit with internal functions.

To summarize, we have two options :

1. Differentiate behavior and run weak checks on internal functions, even if the strict switch is on.
2. Attempt to get back to a single-mode mechanism, which would keep both worlds in sync.

I definitely prefer to explore the second one.

If we still see that employing the strict(er) rules is very noisy with
internal functions, a more appropriate option may be introducing new types
into ZPP, that would correspond to the new rules we introduce in the
userland type hints,

Why not, but let's first try to avoid it, as we'll bikeshed for weeks on the syntax for new keywords.

Remember that class names share the same naming space as type hint keywords. So, defining new keywords will lead to endless discussions. And stating, as I heard, that we just have to use lowercase-only keywords, is not serious. That's a little off-topic here, but I would also propose to deprecate using bare class names as type hints from 7.0. That's a big BC change and a lot will hate that, but the issue was created when the 'array' type hint was added, and we need to fix it (proposing an alternate syntax like 'object(classname)'). As long as we support bare class names, adding a new type is almost impossible and typedef is totally impossible. It's just deprecating, but the impact is such that it can come with a major version only. A smoother path can be to introduce the alternate syntax in 7.0, then deprecate in 7.1 or 7.2. It will take more time but it's probably better.

union types can be kept for the future if we don't go the
nullable road, as it would be too confusing making 'string|null' and
'?string' synonyms.

I think we all agree about that.

Fine :) !

Regards

François

10 years ago by Anthony Ferrara

Zeev et al,

It is not about being lossless or not. People expect bool -> int to be
disabled, for example, and it is not lossless.

It is more a question of finding a consensus about conversions which don't
make sense, and disabling them. Examples include bool conversion to any
other type and, of course, disabling trailing chars in numeric strings.

I agree. It's more of a question of eliminating potentially dangerous
conversions than just being lossless.

Disagree. For a weak-mode, that does solve the problems that many
people want solved. But it also ignores a lot of the problems that the
strict crowd want solved.

Let me show by example:

function convertToInt(string $number): int {
    if (!preg_match("(^[0-9]{1,17}$)", $number)) {
        throw new InvalidArgumentException("Supplied argument is not a valid number");
    }
    return $number;
}

From a weak standpoint that looks like it should work. And from a
pragmatic standpoint it should work as well. But from a static
analysis standpoint, we can't tell.

A static analyzer (one of the reasons people want strict) would error
there. The reason is that at compile time it can't reason about the
code well enough to determine if there's an error or not. You're
passing a string where you expect an int. Is that going to work? We
don't know. So the analyzer would need to throw a warning that the
cast is potentially unsafe because it can't guarantee that the runtime
won't throw an error. Which means that to remove the warning you'd
need to add an explicit cast.

At which point what does the weak mode buy us?

Instead, if we make the strict mode behave based on types alone, this
wouldn't be a problem (because we can detect ahead of time to 100%
accuracy type errors in the strict mode files). And hence prevent
errors at build time, rather than detecting them at runtime in prod.
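
For comparison, here is a sketch of the explicit-cast variant of convertToInt() that a type-based mode would require. It assumes the declare(strict_types=1) spelling that PHP 7.0 eventually adopted, which is not something settled in this thread:

```php
<?php
// Assumption: the per-file switch as PHP 7.0 eventually shipped it.
declare(strict_types=1);

function convertToInt(string $number): int
{
    if (!preg_match("(^[0-9]{1,17}$)", $number)) {
        throw new InvalidArgumentException("Supplied argument is not a valid number");
    }

    // The explicit cast makes the string -> int step visible to a
    // static analyzer; returning $number as-is would be a type error
    // in this mode.
    return (int) $number;
}

var_dump(convertToInt("42")); // int(42)
```

The regular expression still does the value check at runtime; the cast only documents that the conversion is intended.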

associated, built collectively over the last two decades. As Rasmus
demonstrated, flipping that switch on for built-in functions results in a
lot of work to 'clean' the code up, but you end up with having code that's
not necessarily any better. That said, it's quite possible that the
situation will be much improved if & when we implement the less-strict rules
we're proposing here, which would accept "32" as an integer or 37 as a
float.

And

Every option has pros and cons. Since it's clear beyond a reasonable doubt
that we can't all agree on purely weak type hints and equally on purely
strict type hints, it becomes a question of what is the right compromise.
Adding both - which at least from my point of view has major drawbacks (too
prominent zval.type exposure; complexity of two systems; internal
functions issue). Creating something in between that would handle most if
not all of the use cases the strict camp brought up, while not (IMHO) overly
focusing on zval.type and making things a lot more noisy/complex for
built-in functions - is a better direction, whose advantages - I think -
outweigh its disadvantages.

Again, that changes the type checks from type checks to value checks.
Which is fine for a "weak" mode (default mode of operation), but
really throws away a large part of what many strict proponents
want/need. And they won't be able to be satisfied unless zval.type is
exposed fully. Because that's the point of strict typing and static
analysis (and hence any compromise away from it reduces the value of
the type hint to nothing).

What you call issues, I call strengths. The "internal functions issue"
is not really an issue at all to me because that's the entire point of
type safety. And the fact that internal functions are already built-in
with it is a bonus, not a drawback.

The right compromise is the system that Andrea built. Because it
wasn't a compromise (neither side had to give up anything). It
gave both sides exactly what they want and need while letting them
work together transparently. That doesn't sound like a compromise to
me. That sounds like an innovative and ingenious design. And the fact
that it hovered around 2/3 support the entire time shows that. Not to
mention that several people (Daniel and Levi specifically, but others
as well) only voted against it because of the declare semantics.

There are a few details that should be worked out:

1. The switch syntax, as Sara straw-polled for (my vote is
   declare(strict_types=true) at the top of the file only)
2. The behavior of numeric types.

Now, it's been said before here on this list that "No other language
is this strict about types". And that's patently false. I've done some
research across major typed languages:

VB, C and C++: Allows almost freeform movement between numeric types:
http://en.cppreference.com/w/cpp/language/implicit_cast

Java, D, C# and Pascal follow "widening primitive" style rules:
http://docs.oracle.com/javase/specs/jls/se7/html/jls-5.html#jls-5.1.2
It allows only a widening primitive conversion. That means that you
can call a function wanting a float and pass it an int. But the
opposite is not true (you can't pass a float to a function expecting
an int).

F#, Golang, Haskell, Rust: Types always require explicit conversion

Swift: Types always require explicit conversion (except for between
precisions of a type, int to long for example)

So in fact, only VB, C and C++ implement freeform movement across
numeric types. All the rest limit it in some form.

If we want to add a "numeric" type as a virtual union of int and
float, that's one way to solve the concern. If we don't, we could also
allow widening primitive conversion (int -> float). That wouldn't work
well with bigints, but would be fine in other cases. But there are
plenty of languages that always require explicit type conversion. So
even if we choose that, we're in good company.
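
As a rough sketch of the "widening primitive conversion" option, the rule can be written as two checks in PHP. The helpers `accept_float` and `accept_int` are hypothetical names used only for illustration; they are not an API from the RFC. Note that on 64-bit builds very large ints cannot be represented exactly as floats, which is one way to read the bigint caveat above.

```php
<?php
// Hypothetical helpers sketching a "widening only" rule for numeric hints.

function accept_float($value): float
{
    if (is_float($value)) {
        return $value;          // already the target type
    }
    if (is_int($value)) {
        return (float) $value;  // widening: int -> float is allowed
    }
    throw new InvalidArgumentException('expected float or int, got ' . gettype($value));
}

function accept_int($value): int
{
    if (is_int($value)) {
        return $value;
    }
    // No narrowing: a float is rejected even when it holds a whole number.
    throw new InvalidArgumentException('expected int, got ' . gettype($value));
}

var_dump(accept_float(3)); // float(3)

try {
    accept_int(3.0);
} catch (InvalidArgumentException $e) {
    echo "float refused where int is expected\n";
}
```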

I strongly urge you to consider not just the way you'd like to see the
language, but the way others would like to see it as well. It's clear
that there's non-trivial support for strict type-based hinting. Many
people even voted against Andrea's proposal because it was too
weak. Please don't ignore this contingent of users and use-cases
because you don't believe in the benefits of the type system.

Thanks

Anthony

10 years ago by Zeev Suraski

-----Original Message-----
From: Anthony Ferrara [mailto:ircmaxell@gmail.com]
Sent: Tuesday, February 17, 2015 5:48 PM
To: Zeev Suraski
Cc: francois@php.net; Sara Golemon; PHP internals
Subject: Re: [PHP-DEV] Reviving scalar type hints

Zeev et al,

Because it wasn't a compromise (neither side had to give up anything). It
gave both sides exactly what they want and need while letting them work
together transparently.

If it gave both sides exactly what they wanted, how come it generated so
much objection?

Simply put, because it absolutely doesn't give both sides what they wanted.
Many (most?) of those who opposed it opposed it because they believe making
zval.type as prominently available as the RFC did is bad for PHP.
Consequently, this whole 'adding both gives everyone what they want' is
simply wrong. It's not unique to this RFC either; There's a reason we
don't accept all proposals, including countless ones that have zero
compatibility/performance issues, just because we don't think they're a good
fit for the language.

Regarding your point about static analyzers, based on what I saw on this
list it hardly seems that this is the main reason most proponents of strict
types are interested in them. I, for one, think developer productivity is a
lot more important than making life easy for static analyzers, and static
analyzers should be designed around the language, not vice versa.

I urge you to consider the fact that the solution that 'gives everyone what
they wanted' is hardly that at all, and we're trying to find a potential
compromise that will hopefully have a lot fewer people objecting to it. If
we succeed, not everyone will get everything they want, but hopefully a lot
more people will be able to join the yes vote.

Zeev

10 years ago by Andrey Andreev

Hi,

-----Original Message-----
From: Anthony Ferrara [mailto:ircmaxell@gmail.com]
Sent: Tuesday, February 17, 2015 5:48 PM
To: Zeev Suraski
Cc: francois@php.net; Sara Golemon; PHP internals
Subject: Re: [PHP-DEV] Reviving scalar type hints

Zeev et al,

Because it wasn't a compromise (neither side had to give up anything). It
gave both sides exactly what they want and need while letting them work
together transparently.

If it gave both sides exactly what they wanted, how come it generated so
much objection?

Simply put, because it absolutely doesn't give both sides what they wanted.
Many (most?) of those who opposed it opposed it because they believe making
zval.type as prominently available as the RFC did is bad for PHP.
Consequently, this whole 'adding both gives everyone what they want' is
simply wrong.

I agree that it doesn't give everybody what they want - it only gave
weak hint supporters all that they want.

Many also objected because strict typing was only opt-in and could
never affect the caller's code unless the caller explicitly declares
that they want to do that. You're ignoring that and you're twisting it
the other way around.

Cheers,
Andrey.

10 years ago by Zeev Suraski

Hi,

If it gave both sides exactly what they wanted, how come it generated so
much objection?

Simply put, because it absolutely doesn't give both sides what they wanted.
Many (most?) of those who opposed it opposed it because they believe making
zval.type as prominently available as the RFC did is bad for PHP.
Consequently, this whole 'adding both gives everyone what they want' is
simply wrong.

I agree that it doesn't give everybody what they want - it only gave
weak hint supporters all that they want.

Andrey,

I'm a weak typing supporter; I want PHP to never make it easy at the language level to treat "32" and 32 differently; The RFC did exactly that.

-> The v0.3 RFC didn't give weak hint supporters everything they wanted. QED.

Many also objected because strict typing was only opt-in and could
never affect the caller's code unless the caller explicitly declares
that they want to do that. You're ignoring that and you're twisting it
the other way around.

It's enough to provide one counter example to disprove an assertion - the assertion that the v0.3 RFC gave everyone what they wanted - and I provided the one I can personally attest to. I certainly didn't claim strict typing supporters got everything they wanted, so I'm not sure why I'm twisting anything. If anything, you're only making the point that the v0.3 RFC doesn't give everyone what they want stronger.

I think the options we're discussing here take us away from this zero-sum game and provide benefits to both schools of thought, and it seems to me as if you were open to it. I'd much rather we invested our energies there!

Zeev

10 years ago by Andrey Andreev

Hi,

Hi,

If it gave both sides exactly what they wanted, how come it generated so
much objection?

Simply put, because it absolutely doesn't give both sides what they wanted.
Many (most?) of those who opposed it opposed it because they believe making
zval.type as prominently available as the RFC did is bad for PHP.
Consequently, this whole 'adding both gives everyone what they want' is
simply wrong.

I agree that it doesn't give everybody what they want - it only gave
weak hint supporters all that they want.

Andrey,

I'm a weak typing supporter; I want PHP to never make it easy at the language level to treat "32" and 32 differently; The RFC did exactly that.

Yes, I already know that.
The difference, and why I keep pointing that out, is that me and many
others want strict typing for our own reasons (but still in its
entirety instead of as a limited mode) and most of us don't even care
if you're getting weak typing for your own usage. You can't work towards
consensus if your target is to prevent the opposing group from getting
what they want. I see both as valuable tools for different jobs and I
want to have more tools at my disposal, while you're trying to tell me
that I should use only one tool for everything.

-> The v0.3 RFC didn't give weak hint supporters everything they wanted. QED.

Many also objected because strict typing was only opt-in and could
never affect the caller's code unless the caller explicitly declares
that they want to do that. You're ignoring that and you're twisting it
the other way around.

It's enough to provide one counter example to disprove an assertion - the assertion that the v0.3 RFC gave everyone what they wanted - and I provided the one I can personally attest to. I certainly didn't claim strict typing supporters got everything they wanted, so I'm not sure why I'm twisting anything. If anything, you're only making the point that the v0.3 RFC doesn't give everyone what they want stronger.

Yes, I am making the point stronger.

But you implied that most objections were from people who don't want
strict typing in PHP at all. And I disagree with that because it's a
speculation, which in turn you are using to favor your weak-hints-only
case (hence, twisting it in another direction).

I think the options we're discussing here take us away from this zero-sum game and provide benefits to both schools of thought, and it seems to me as if you were open to it. I'd much rather we invested our energies there!

Zeev

Cheers,
Andrey.

10 years ago by Zeev Suraski

Yes, I already know that.
The difference, and why I keep pointing that out, is that me and many others
want strict typing for our own reasons (but still in its entirety instead of
as a limited mode) and most of us don't even care if you're getting weak
typing for your own usage. You can't work towards consensus if your target is
to prevent the opposing group from getting what they want. I see both as
valuable tools for different jobs and I want to have more tools at my
disposal, while you're trying to tell me that I should use only one tool for
everything.

First, it's very important to understand that my target isn't to prevent the
opposing group from getting what they want. I'm really not sadistic :) My
reasons were obviously different and worked towards a different goal. Much
in the same way that people who vote against an RFC - one of the countless
that were voted against - don't do that to hurt the ones who support it.
They do it because they think adding it would bring negative consequences.
I never believed the 'You don't have to use it' as a silver bullet
explanation for why it's OK to add features with potentially negative
implications.

The good news is that I think that in many ways the ideas we're toying with
right now are better for the strict-type camp, especially if we end up going
for just one mode, and meet roughly mid-way in terms of strict and weak -
which I think is doable. The biggest gripes strict campers had with weak
mode are gone in this proposal, and unlike v0.3 - that would actually be the
default (and only) behavior, which is a big gain for the strict campers.
And the most prominent features of weak typing are kept (dynamic type
conversion where it makes sense), hopefully making the weak campers happy
too.

But you implied that most objections were from people who don't want strict
typing in PHP at all. And I disagree with that because it's a speculation,
which in turn you are using to favor your weak-hints-only case (hence,
twisting it in another direction).

I didn't imply it now (at least I certainly didn't intend to). I did
outright say it a week or two ago, and still believe that's the case but
reached the conclusion that none of us would gain anything from further
discussing it. We won't know unless we start actually polling the people
who voted and ask, which we're not going to do, and we're obviously not
going to convince each other. Much more importantly, it at least seems as
if we have a direction for something that a very wide audience may rally
behind. Let's focus on that!

Zeev

10 years ago by Larry Garfield

Yes, I already know that.
The difference, and why I keep pointing that out, is that me and many others
want strict typing for our own reasons (but still in its entirety instead of
as a limited mode) and most of us don't even care if you're getting weak
typing for your own usage. You can't work towards consensus if your target is
to prevent the opposing group from getting what they want. I see both as
valuable tools for different jobs and I want to have more tools at my
disposal, while you're trying to tell me that I should use only one tool for
everything.

First, it's very important to understand that my target isn't to prevent the
opposing group from getting what they want. I'm really not sadistic :) My
reasons were obviously different and worked towards a different goal. Much
in the same way that people who vote against an RFC - one of the countless
that were voted against - don't do that to hurt the ones who support it.
They do it because they think adding it would bring negative consequences.
I never believed the 'You don't have to use it' as a silver bullet
explanation for why it's OK to add features with potentially negative
implications.

The good news is that I think that in many ways the ideas we're toying with
right now are better for the strict-type camp, especially if we end up going
for just one mode, and meet roughly mid-way in terms of strict and weak -
which I think is doable. The biggest gripes strict campers had with weak
mode are gone in this proposal, and unlike v0.3 - that would actually be the
default (and only) behavior, which is a big gain for the strict campers.
And the most prominent features of weak typing are kept (dynamic type
conversion where it makes sense), hopefully making the weak campers happy
too.

But you implied that most objections were from people who don't want strict
typing in PHP at all. And I disagree with that because it's a speculation,
which in turn you are using to favor your weak-hints-only case (hence,
twisting it in another direction).

I didn't imply it now (at least I certainly didn't intend to). I did
outright say it a week or two ago, and still believe that's the case but
reached the conclusion that none of us would gain anything from further
discussing it. We won't know unless we start actually polling the people
who voted and ask, which we're not going to do, and we're obviously not
going to convince each other. Much more importantly, it at least seems as
if we have a direction for something that a very wide audience may rally
behind. Let's focus on that!

Zeev

At this point, if I could rephrase the "camps" a bit I see two different
sets of priorities:

1. PHP should do what seems "obviously safe" to do, to make life easiest
for developers. That is, it's patently obvious that "32" and 32 are
equivalent, so don't make developers worry about the distinction because
to them there isn't one. This is an entirely reasonable position.

2. PHP would benefit hugely from static analysis tools and compile-time
type-based optimizations, but those are only possible with code that is
strongly typed. Currently such tools do not really exist, but with
compile-time-knowable information they could be written and even
incorporated into future versions of PHP without API breaks. (I think
Anthony demonstrated earlier examples of function calls no longer being
slow, for instance, if the type juggling could be removed at compile
time.) This is an entirely reasonable position.

Naturally those two positions are mutually exclusive; if the compiler
has to allow for "32" to be converted to 32 at runtime, it can't
optimize the opcodes by removing the code that would do that conversion!

I was against the mixed-mode approach before, but given the above I am
warming to it. The trade off here is between DX (in the sense of the
code "doing what I mean" and not babysitting type information) and
potential performance and bug-finding benefits. Different places in an
application may need different trade-offs. In practice, the closer you
are to an IO action (browser input, database, file, etc.) the more you
want the "obviously safe" behavior; once you pass one layer of specified
typing you can be pretty confident that strict checks will "just work"
from there on out.

In essence, opt-in-strict becomes an opt-in "compiler, be pedantic so
you can make my code faster" flag. More carrot than stick, since people
can control when they opt-in to fancier compiler optimizations at the
cost of some DX, but only in some cases.

I started this email planning to ask Anthony how flexible strict
checking could get without losing the benefits of it, but I think I've
just convinced myself the answer is "not very". Which then leaves only
the question of internal functions that Rasmus raised, which... it looks
like is discussed in later emails so I will try to catch up on those. :-)

--Larry Garfield

10 years ago by Stanislav Malyshev

Hi!

PHP would benefit hugely from static analysis tools and compile-time
type-based optimizations, but those are only possible with code that is
strongly typed. Currently such tools do not really exist, but with

Is that really the case? Javascript has very good optimizing engine, and
Javascript has no typing. Of course, it is probably easier with strict
types, but harder and impossible are two very different things.

In fact, I do not see large benefits for static analysis from scalar
typing - execution path rarely is different depending on if something is
integer or string, except for obvious is_* check - but those are rarely
controlling any useful logic. It may be different depending on if it's a
scalar or an object - but that we already have covered.

Stas Malyshev
smalyshev@gmail.com

10 years ago by Zeev Suraski

-----Original Message-----
From: Larry Garfield [mailto:larry@garfieldtech.com]
Sent: Thursday, February 19, 2015 9:00 AM
To: internals@lists.php.net
Subject: Re: [PHP-DEV] Reviving scalar type hints

Yes, I already know that.
At this point, if I could rephrase the "camps" a bit I see two different
sets of priorities:

1. PHP should do what seems "obviously safe" to do, to make life easiest
for developers. That is, it's patently obvious that "32" and 32 are
equivalent, so don't make developers worry about the distinction because
to them there isn't one. This is an entirely reasonable position.

2. PHP would benefit hugely from static analysis tools and compile-time
type-based optimizations, but those are only possible with code that is
strongly typed. Currently such tools do not really exist, but with
compile-time-knowable information they could be written and even
incorporated into future versions of PHP without API breaks. (I think
Anthony demonstrated earlier examples of function calls no longer being
slow, for instance, if the type juggling could be removed at compile
time.) This is an entirely reasonable position.

Larry,

There's actually very little difference between coercive type hinting and
strict type hinting in terms of performance. If you read what both Dmitry
and Anthony said, it should be clear that the vast majority of gains can be
had even without any sort of type hinting at all - and as Stas pointed out,
JavaScript has some mind blowing JIT optimizations without any explicit type
info at all.

Moreover, I think it's easy to lose sight of the forest for the trees here, by
focusing on a very narrow piece of code - without looking at the bigger
picture.

Ultimately, if you have a piece of data that you want to pass from a caller
to a callee, it could be under one of three labels:

1. A piece of data the callee can use as-is.
2. A piece of data the callee can use after conversion (be it explicit or
implicit).
3. A piece of data the callee cannot/shouldn't use.

When comparing strict and coercive type hints, there's no difference between
them in terms of #1; There's a subtle difference with #3 - but only in the
error situation. In other words, for coercive type hints, it would just
take a bit more time before they fail, because they have to conduct a few
more checks. However, that's an error situation anyway, which is either
already going to bail out, or go through error handling code - which would
be very slow anyway.

So focusing on #2, in a practical real world situation - the difference is
actually a lot more subtle than people might think if they only zoom in on
the area around parameter passing. The bigger picture is, what would the
code author - the one making the call - want to do, semantically? In other
words, if you have "32" coming from a database or whatnot, are you likely to
want an API that accepts an int to be able to use that? I think the answer
is almost always yes. So practically, what will happen with strict typing
is that you'd explicitly cast it to int, while with coercive typing - you'd
rely on the language to do it for you. Arguably, very little difference
between the two in terms of performance. Note that it's possible people
will be able to come up with various edge cases where strict typing might
somehow alert you to a situation that may push you to change your code in a
way it might end up being slightly faster. But those will be edge cases and
should be taken in the context - in the vast majority of code patterns,
there's zero difference between the two approaches in terms of performance.
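
The caller-side comparison can be sketched as follows. This assumes a PHP 7 or later runtime in its default (non-strict) mode, which postdates this thread and is used here only as a stand-in for coercive hints; the rule set being drafted in the thread may differ. `set_age` and the `$row` array are made-up names for illustration.

```php
<?php
// Assumes PHP 7+ in its default (non-strict) mode. set_age() is a made-up API.
function set_age(int $age): int
{
    return $age;
}

// A value as it might come back from a database driver: a numeric string.
$row = ['age' => '32'];

$viaCast     = set_age((int) $row['age']); // what a caller under strict hints would write
$viaCoercion = set_age($row['age']);       // what a coercive hint does on the caller's behalf

var_dump($viaCast === $viaCoercion);       // bool(true): both are int 32

try {
    set_age('Apple');                      // not numeric: the hint rejects it
} catch (TypeError $e) {
    echo "rejected\n";
}
```

In both calls exactly one string-to-int conversion takes place; the only difference is who writes it.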

In terms of functionality, however, there's actually a substantial
difference between the two - explicit casting is a lot more aggressive than
the coercion rules we're thinking about for coercive type hints. It'll
happily and silently coerce "Apple" into 0, "100 dogs" into 100, and 3.1415
into 3.
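
The three conversions mentioned above are easy to reproduce with explicit casts. The second half of the sketch is a hypothetical `to_int_lossless` helper showing one possible "refuse anything lossy" rule; it is an assumption made for illustration, not the rule set proposed in this thread.

```php
<?php
// Explicit casts never fail; they silently discard whatever does not fit.
var_dump((int) 'Apple');    // int(0)
var_dump((int) '100 dogs'); // int(100)
var_dump((int) 3.1415);     // int(3)

// Hypothetical stricter conversion: succeed only when nothing is lost.
function to_int_lossless($value): int
{
    if (is_int($value)) {
        return $value;
    }
    if (is_string($value) && (string) (int) $value === $value) {
        return (int) $value;   // "100" round-trips, "100 dogs" does not
    }
    if (is_float($value) && is_finite($value) && floor($value) === $value
        && $value >= PHP_INT_MIN && $value < PHP_INT_MAX) {
        return (int) $value;   // 3.0 is fine, 3.1415 is not
    }
    throw new InvalidArgumentException('lossy conversion to int refused');
}

foreach (['Apple', '100 dogs', 3.1415] as $input) {
    try {
        to_int_lossless($input);
    } catch (InvalidArgumentException $e) {
        echo "refused\n";      // printed once for each of the three inputs
    }
}
```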

Now, diving back to future potential AOT/JIT, it's simply not true that
there's any gain at all from strict typing - or at least, neither Dmitry
(who wrote a full JIT compiler for PHP that runs Mandelbrot as fast as gcc
does) nor I were able to understand what it would be. Anthony spoke about being able
to completely eliminate the zval container and all associated checks, so
that in certain situations you'd be able to map a PHP integer all the way
down to a C (or asm) integer. That can certainly be done, but it has
nothing to do with strict vs. coercive type hints. Here's why:

At this point I think it's clear to everyone that inside the called
function, there's zero difference between strict and coercive typing (or
even the weak typing we were talking about earlier). They're 100%
guaranteed to receive what they asked, either because values were coerced or
blocked from even making it into the function.
On the outside calling code - if you can conduct the level of type
inference that would enable you to safely compile a PHP integer into a
machine code integer, by all means - do it; While at it, generate slightly
different function calling code that would bypass zval type checks
altogether, and provide that function with the integer it wanted.

Note that in his JIT POC, Dmitry managed to conduct a lot of this without
any type hinting at all, so while type hints (be they
strict/coercive/weak) make this job a bit easier - they're hardly required;
Nor do they solve the bigger challenging problem - which is type inference
in the various functions' code bodies themselves - since we don't have
variable declarations or strong typing in PHP.

Naturally those two positions are mutually exclusive; if the compiler has to
allow for "32" to be converted to 32 at runtime, it can't optimize the
opcodes by removing the code that would do that conversion!

In essence, opt-in-strict becomes an opt-in "compiler, be pedantic so you can
make my code faster" flag. More carrot than stick, since people can control
when they opt-in to fancier compiler optimizations at the cost of some DX,
but only in some cases.

I hope what I said above illustrates why it's a misperception - and I think
it's a widely spread one. If your data source has the wrong type, and you
still want to use it - you'd have to convert it. The cost would be similar
whether it's done automatically by the language for you, or done manually
through an explicit cast - the latter being significantly more likely to
hide bugs. If people are in favor of strict typing because they think it
can help generate faster code - they should understand it's a misperception
and focus on the functionality instead!

I started this email planning to ask Anthony how flexible strict checking
could get without losing the benefits of it, but I think I've just convinced
myself the answer is "not very". Which then leaves only the question of
internal functions that Rasmus raised, which... it looks like is discussed in
later emails so I will try to catch up on those. :-)

I hope I can convince you back :)
Given that there are no substantial performance gains for strict typing vs.
coercive typing (again: no performance gains from strict vs. coercive
typing), we're really talking about functionality here.

I actually think the strict camp has a lot to gain from a single mode that is
fairly strict, but not as strict as a zval.type comparison. Most notably - the vast
majority of use cases that were brought up by strict typing proponents, such
as rejecting lossy conversions ("100 dogs" -> 100, 37.7 -> 37, etc.) and
rejecting 'inventive' conversions (like bool->anything) - will not only be
supported, but they would be the default, and actually only available
behavior. That is compared with the currently proposed RFC, where strict
typing would have to be explicitly enabled. I also think that avoiding the
proliferation of explicit casts - that is bound to happen by people
adjusting their code to be strict compliant in a hurry - is a big gain for
many strict typing proponents.

It's true that there may be certain use cases that coercive type hints may make
more difficult - such as static analysis (I'm not entirely sure why that is,
but I never dived into that) - but that in itself isn't a good enough
reason, IMHO, to introduce a second, separate mode that deals with scalars
in such a different way than the rest of PHP.

Obviously, I think 'weak' campers have a lot to gain too - by making
sensible conversions work fine as expected, without having to resort to
explicit casts.
And everyone stands to gain from having just one mode, instead of two.
The coercive typing approach would require each camp to give up a bit of
their 'ideology', but it also gives both schools of thought most of what
they want, including the key tenets for each camp (rejecting non-sensible
conversions - always, allowing sensible ones - always). I believe that's
what makes it a good compromise, a better one than the currently proposed
RFC.

Thanks!

Zeev

10 years ago by Lester Caine

Obviously, I think 'weak' campers have a lot to gain too - by making
sensible conversions work fine as expected, without having to resort to
explicit casts.
And everyone stands to gain from having just one mode, instead of two.
The coercive typing approach would require each camp to give up a bit of
their 'ideology', but it also gives both schools of thought most of what
they want, including the key tenets for each camp (rejecting non-sensible
conversions - always, allowing sensible ones - always). I believe that's
what makes it a good compromise, a better one than the currently proposed
RFC.

Now that all made sense!

My only grey area is 'allowing sensible ones' where the size is an
integral part of what is 'sensible' ... the one where conventional
strict typing uses a type of the right size?

--
Lester Caine - G8HFL

10 years ago by francois@php.net

Hi Lester,

I didn't add restrictions specific to number representation in the draft ruleset yet because, while I think that's an important point, I didn't have time to study it in depth.

I know you're an expert on this as you continuously (rightly) raised the point.

So, can you elaborate on this and send me or, better, publish on the list the detailed set of changes you suggest, including 32 bit vs 64 bit concerns if they fit? Today, conversion restrictions are rather limited, as floats which don't fit in int give 0, and int to float is considered always possible. I mean that this must be technically incorrect, even if it goes unnoticed in the vast majority of cases.

So can you write a consistent set of changes you would introduce?

Thanks

François

-----Original Message-----
From: Lester Caine [mailto:lester@lsces.co.uk]
Sent: Thursday, February 19, 2015 11:24 AM
To: internals@lists.php.net
Subject: Re: [PHP-DEV] Reviving scalar type hints

Obviously, I think 'weak' campers have a lot to gain too - by making
sensible conversions work fine as expected, without having to resort to
explicit casts.
And everyone stands to gain from having just one mode, instead of two.
The coercive typing approach would require each camp to give up a bit of
their 'ideology', but it also gives both schools of thought most of what
they want, including the key tenets for each camp (rejecting non-sensible
conversions - always, allowing sensible ones - always). I believe that's
what makes it a good compromise, a better one than the currently proposed
RFC.

Now that all made sense!

My only grey area is 'allowing sensible ones' where the size is an
integral part of what is 'sensible' ... the one where conventional
strict typing uses a type of the right size?

--
Lester Caine - G8HFL

Contact - http://lsces.co.uk/wiki/?page=contact
L.S.Caine Electronic Services - http://lsces.co.uk
EnquirySolve - http://enquirysolve.com/
Model Engineers Digital Workshop - http://medw.co.uk
Rainbow Digital Media - http://rainbowdigitalmedia.co.uk

10 years ago by Lester Caine

I didn't add restrictions specific to number representation in the draft ruleset yet because, while I think that's an important point, I didn't have time to study it in depth.

I know you're an expert on this as you continuously (rightly) raised the point.

So, can you elaborate on this and send me or, better, publish on the list the detailed set of changes you suggest, including 32 bit vs 64 bit concerns if they fit? Today, conversion restrictions are rather limited, as floats which don't fit in int give 0, and int to float is considered always possible. I mean that this must be technically incorrect, even if it goes unnoticed in the vast majority of cases.

So can you write a consistent set of changes you would introduce?

François ... My main interest in this over the years has been while
trying to port data between different engines, and that includes porting
legacy data from older systems such as dbase some of which originates
from 8 bit processor days.

One area that overlays the general discussion is the simple size of a
scalar value, and this just mirrors the number of bytes used to store
it. I1, I2, I4 and I8, but a more general validation facility on top of
'integer' would be a simple range.
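
A minimal sketch of such a size or range check in PHP, under the assumption that I1, I2, I4 and I8 mean signed integers of 1, 2, 4 and 8 bytes. Both function names are hypothetical, and `$bytes` is assumed to be a positive value.

```php
<?php
// Hypothetical helper: does $value fit a signed integer stored in $bytes bytes
// (1, 2, 4 or 8, matching the I1/I2/I4/I8 sizes mentioned above)?
function fits_signed_width(int $value, int $bytes): bool
{
    if ($bytes >= PHP_INT_SIZE) {
        return true;                     // a native int cannot exceed its own width
    }
    $max = (1 << (8 * $bytes - 1)) - 1;  // e.g. 127 for one byte
    $min = -$max - 1;                    // e.g. -128 for one byte
    return $value >= $min && $value <= $max;
}

// The more general form: an arbitrary inclusive range on top of 'integer'.
function in_range(int $value, int $min, int $max): bool
{
    return $value >= $min && $value <= $max;
}

var_dump(fits_signed_width(127, 1));   // bool(true)
var_dump(fits_signed_width(128, 1));   // bool(false)
var_dump(fits_signed_width(32768, 2)); // bool(false)
var_dump(in_range(5, 1, 10));          // bool(true)
```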

decimal or numeric and float may get lumped together, but integer is a
much better base for decimal/numeric since it is lossless and simply
moves the position of the decimal point. Storage only requires an
integer of the appropriate size for the number of digits.
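
The idea of storing a decimal as an integer plus a decimal-point position can be sketched like this. Both functions are hypothetical illustrations, and the sketch assumes the digits fit in a native int on the current platform (no overflow handling is attempted).

```php
<?php
// Hypothetical fixed-point helpers: a decimal is stored as an integer
// together with a scale (the number of digits after the decimal point).

function decimal_to_scaled_int(string $decimal, int $scale): int
{
    if (preg_match('/\A(-?)(\d+)(?:\.(\d+))?\z/', $decimal, $m) !== 1) {
        throw new InvalidArgumentException("not a decimal: $decimal");
    }
    $fraction = $m[3] ?? '';
    if (strlen($fraction) > $scale) {
        throw new InvalidArgumentException('more fractional digits than the scale allows');
    }
    // Assumption: the resulting digit string fits in a native int.
    $magnitude = (int) ($m[2] . str_pad($fraction, $scale, '0'));
    return $m[1] === '-' ? -$magnitude : $magnitude;
}

function scaled_int_to_decimal(int $value, int $scale): string
{
    $sign   = $value < 0 ? '-' : '';
    $digits = str_pad((string) abs($value), $scale + 1, '0', STR_PAD_LEFT);
    if ($scale === 0) {
        return $sign . $digits;
    }
    return $sign . substr($digits, 0, -$scale) . '.' . substr($digits, -$scale);
}

var_dump(decimal_to_scaled_int('12.34', 2)); // int(1234)
var_dump(scaled_int_to_decimal(1234, 2));    // string(5) "12.34"
var_dump(scaled_int_to_decimal(-5, 2));      // string(5) "-0.05"
```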

float is another misused definition since it normally only allows for
32 bits of data, while double expands that to 64 bits. So once again the
platform may affect how these values are processed.

Pierre - You may think that the size of integer is irrelevant in the
discussion on providing a strict type use of it, but in practice it is
somewhat important if the codebase you are creating is only 32bit but
all the data being handled is 64bit.

--
Lester Caine - G8HFL

10 years ago by Zeev Suraski

-----Original Message-----
From: Lester Caine [mailto:lester@lsces.co.uk]
Sent: Thursday, February 19, 2015 12:24 PM
To: internals@lists.php.net
Subject: Re: [PHP-DEV] Reviving scalar type hints

Obviously, I think 'weak' campers have a lot to gain too - by making
sensible conversions work fine as expected, without having to resort
to explicit casts.
And everyone stands to gain from having just one mode, instead of two.
The coercive typing approach would require each camp to give up a bit
of their 'ideology', but it also gives both schools of thought most
of what they want, including the key tenets for each camp (rejecting
non-sensible conversions - always, allowing sensible ones - always).
I believe that's what makes it a good compromise, a better one than
the currently proposed RFC.

Now that all made sense!

My only grey area is 'allowing sensible ones' where the size is an integral
part of what is 'sensible' ... the one where conventional strict typing uses
a type of the right size?

I think the guiding principle for these conversions should be no data loss.
This may mean we have different limits on different architectures, depending
on whether they're 32-bit or 64-bit.
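
A "no data loss" rule between int and float could look roughly like the sketch below. The helper names and the conservative 2**53 bound are assumptions made for illustration; the effective limits follow `PHP_INT_SIZE`, so they differ between 32-bit and 64-bit builds.

```php
<?php
// Hypothetical checks for lossless int <-> float conversion.

// A double represents every integer up to 2**53 in magnitude exactly.
const FLOAT_EXACT_INT_LIMIT = 9007199254740992; // 2**53

// Conservative: some larger ints (e.g. powers of two) are also exact,
// but every int within this bound is guaranteed to survive the trip.
function int_is_safe_as_float(int $i): bool
{
    return $i >= -FLOAT_EXACT_INT_LIMIT && $i <= FLOAT_EXACT_INT_LIMIT;
}

// float -> int loses nothing only for finite whole numbers inside the
// platform's int range (2**31 on 32-bit builds, 2**63 on 64-bit builds).
function float_fits_in_int(float $f): bool
{
    return is_finite($f)
        && floor($f) === $f
        && $f >= (float) PHP_INT_MIN
        && $f < -(float) PHP_INT_MIN;
}

var_dump(int_is_safe_as_float(42)); // bool(true)
var_dump(float_fits_in_int(3.0));   // bool(true)
var_dump(float_fits_in_int(3.5));   // bool(false)
var_dump(float_fits_in_int(1e30));  // bool(false)

// Two different integers that collapse to the same float:
var_dump((float) 9007199254740993 === (float) 9007199254740992); // bool(true)
```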

Zeev

10 years ago by Lester Caine

Now that all made sense!

My only grey area is 'allowing sensible ones' where the size is an
integral part
of what is 'sensible' ... the one where conventional strict typing uses a
type
of the right size?
I think the guiding principle for these conversions should be no data loss.
This may mean we have different limits on different architectures, depending
on whether they're 32-bit or 64-bit.

This still leaves the 'black hole' caused by the fact that databases are
actively using 'BIGINT' even on 32 bit platforms. It may be that the
only practical approach is gmp, but using that and writing code that
selects that on a 32bit platform, then switching to clean 64bit maths on
a 64bit platform does not sound like simplifying things?

As with other debates, some say ignore 32 bit, and others say let's lose
the constraints altogether, but having a fundamental type behave
differently depending on platform is a problem?

--
Lester Caine - G8HFL

10 years ago by Pierre Joye

Now that all made sense!

My only grey area is 'allowing sensible ones' where the size is an
integral part
of what is 'sensible' ... the one where conventional strict typing uses a
type
of the right size?
I think the guiding principle for these conversions should be no data loss.
This may mean we have different limits on different architectures, depending
on whether they're 32-bit or 64-bit.

This still leaves the 'black hole' caused by the fact that databases are
actively using 'BIGINT' even on 32 bit platforms. It may be that the
only practical approach is gmp, but using that and writing code that
selects that on a 32bit platform, then switching to clean 64bit maths on
a 64bit platform does not sound like simplifying things?

As with other debates, some say ignore 32 bit, and others say let's lose
the constraints altogether, but having a fundamental type behave
differently depending on platform is a problem?

Lester,

You keep coming back to this topic in every possible thread.

While I understand that DB interactions are some of the primary usages of
PHP, this discussion has nothing to do with that. DB drivers will do
what they have to do to deal with PHP & DB types, as they always do.
It would be very good if you could focus on the actual content of an
RFC or discussion instead.

--
Pierre

@pierrejoye | http://www.libgd.org

10 years ago by Larry Garfield

-----Original Message-----
From: Larry Garfield [mailto:larry@garfieldtech.com]
Sent: Thursday, February 19, 2015 9:00 AM
To: internals@lists.php.net
Subject: Re: [PHP-DEV] Reviving scalar type hints

Yes, I already know that.
At this point, if I could rephrase the "camps" a bit I see two different
sets of
priorities:

PHP should do what seems "obviously safe" to do, to make life easiest
for
developers. That is, it's patently obvious that "32" and 32 are
equivalent, so
don't make developers worry about the distinction because to them there
isn't one. This is an entirely reasonable position.

PHP would benefit hugely from static analysis tools and compile-time
type-based optimizations, but those are only possible with code that is
strongly typed. Currently such tools do not really exist, but with
compile-time-knowable information they could be written and even
incorporated into
future versions of PHP without API breaks. (I think Anthony demonstrated
earlier examples of function calls no longer being slow, for instance, if
the
type juggling could be removed at compile
time.) This is an entirely reasonable position.
Larry,

There's actually very little difference between coercive type hinting and
strict type hinting in terms of performance. If you read what both Dmitry
and Anthony said, it should be clear that the vast majority of gains can be
had even without any sort of type hinting at all - and as Stas pointed out,
JavaScript has some mind blowing JIT optimizations without any explicit type
info at all.

Moreover, I think it's easy to miss the forest for the trees here, by
focusing on a very narrow piece of code - without looking at the bigger
picture.

Ultimately, if you have a piece of data that you want to pass from a caller
to a callee, it could be under one of three labels:

A piece of data the callee can use as-is.

A piece of data the callee can use after conversion (be it explicit or
implicit).

A piece of data the callee cannot/shouldn't use.

When comparing strict and coercive type hints, there's no difference between
them in terms of #1; There's a subtle difference with #3 - but only in the
error situation. In other words, for coercive type hints, it would just
take a bit more time before they fail, because they have to conduct a few
more checks. However, that's an error situation anyway, which is either
already going to bail out, or go through error handling code - which would
be very slow anyway.

So focusing on #2, in a practical real world situation - the difference is
actually a lot more subtle than people might think if they only zoom in on
the area around parameter passing. The bigger picture is, what would the
code author - the one making the call - want to do, semantically? In other
words, if you have "32" coming from a database or whatnot, are you likely to
want an API that accepts an int to be able to use that? I think the answer
is almost always yes. So practically, what will happen with strict typing
is that you'd explicitly cast it to int, while with coercive typing - you'd
rely on the language to do it for you. Arguably, very little difference
between the two in terms of performance. Note that it's possible people
will be able to come up with various edge cases where strict typing might
somehow alert you to a situation that may push you to change your code in a
way it might end up being slightly faster. But those will be edge cases and
should be taken in the context - in the vast majority of code patterns,
there's zero difference between the two approaches in terms of performance.

In terms of functionality, however, there's actually a substantial
difference between the two - explicit casting is a lot more aggressive than
the coercion rules we're thinking about for coercive type hints. It'll
happily and silently coerce "Apple" into 0, "100 dogs" into 100, and 3.1415
into 3.
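
For concreteness, this is what the explicit casts mentioned above do in current PHP. The snippet only uses the built-in cast operator and is_numeric(); the choice of is_numeric() as a contrast is an illustration, not the coercion rule set under discussion.

```php
<?php
// Explicit casts never fail; they silently discard whatever does not fit.
var_dump((int) "Apple");      // int(0)
var_dump((int) "100 dogs");   // int(100)
var_dump((int) 3.1415);       // int(3)

// is_numeric() is one existing way to tell the string cases apart up front.
var_dump(is_numeric("100 dogs")); // bool(false)
var_dump(is_numeric("32"));       // bool(true)
```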

Now, diving back to future potential AOT/JIT, it's simply not true that
there's any gain at all from strict typing - or at least, neither Dmitry
(who wrote a full JIT compiler for PHP that runs Mandelbrot as fast as gcc
does) nor I was able to understand them. Anthony spoke about being able
to completely eliminate the zval container and all associated checks, so
that in certain situations you'd be able to map a PHP integer all the way
down to a C (or asm) integer. That can certainly be done, but it has
nothing to do with strict vs. coercive type hints. Here's why:

At this point I think it's clear to everyone that inside the called
function, there's zero difference between strict and coercive typing (or
even the weak typing we were talking about earlier). They're 100%
guaranteed to receive what they asked, either because values were coerced or
blocked from even making it into the function.

On the outside calling code - if you can conduct the level of type
inference that would enable you to safely compile a PHP integer into a
machine code integer, by all means - do it; While at it, generate slightly
different function calling code that would bypass zval type checks
altogether, and provide that function with the integer it wanted.

Note that in his JIT POC, Dmitry managed to conduct a lot of this without
any type hinting at all, so while type hints (be they
strict/coercive/weak) make this job a bit easier - they're hardly required;
Nor do they solve the bigger challenging problem - which is type inference
in the various functions' code bodies themselves - since we don't have
variable declarations or strong typing in PHP.

Naturally those two positions are mutually exclusive; if the compiler has
to
allow for "32" to be converted to 32 at runtime, it can't optimize the
opcodes by removing the code that would do that conversion!

In essence, opt-in-strict becomes an opt-in "compiler, be pedantic so you
can
make my code faster" flag. More carrot than stick, since people can
control
when they opt-in to fancier compiler optimizations at the cost of some DX,
but only in some cases.
I hope what I said above illustrates why it's a misperception - and I think
it's a widely spread one. If your data source has the wrong type, and you
still want to use it - you'd have to convert it. The cost would be similar
whether it's done automatically by the language for you, or done manually
through an explicit cast - the latter being significantly more likely to
hide bugs. If people are in favor of strict typing because they think it
can help generate faster code - they should understand it's a misperception
and focus on the functionality instead!

I started this email planning to ask Anthony how flexible strict checking
could
get without losing the benefits of it, but I think I've just convinced
myself the
answer is "not very". Which then leaves only the question of internal
functions that Rasmus raised, which... it looks like is discussed in later
emails
so I will try to catch up on those. :-)
I hope I can convince you back :)
Given that there are no substantial performance gains for strict typing vs.
coercive typing (again, no performance gains from strict vs. coercive
typing), we're really talking about functionality here.

I actually think the strict camp has a lot to gain from the single, fairly
strict but not as strict as zval.type comparison. Most notably - the vast
majority of use cases that were brought up by strict typing proponents, such
as rejecting lossy conversions ("100 dogs" -> 100, 37.7 -> 37, etc.) and
rejecting 'inventive' conversions (like bool->anything) - will not only be
supported, but they would be the default, and actually only available
behavior. That is compared with the currently proposed RFC, where strict
typing would have to be explicitly enabled. I also think that avoiding the
proliferation of explicit casts - that is bound to happen by people
adjusting their code to be strict compliant in a hurry - is a big gain for
many strict typing proponents.
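
A minimal sketch of what such a single, coercive rule could look like for an int parameter. The function name coerce_to_int() and the exact rule set are assumptions drawn from the examples in this message ("100 dogs", 37.7, bool); this is not the text of any RFC.

```php
<?php
// Hypothetical sketch: accept a value for an int parameter only when the
// conversion loses nothing; reject everything else with a TypeError.
function coerce_to_int($value): int
{
    if (is_int($value)) {
        return $value;
    }
    if (is_float($value)
        && is_finite($value)
        && floor($value) === $value
        && $value >= (float) PHP_INT_MIN
        && $value < (float) PHP_INT_MAX + 1.0) {
        return (int) $value;       // 37.0 is fine, 37.7 is not
    }
    if (is_string($value)) {
        $int = filter_var($value, FILTER_VALIDATE_INT);
        if ($int !== false) {
            return $int;           // "32" is fine, "100 dogs" is not
        }
    }
    // Booleans, null, arrays, objects and lossy values all end up here.
    throw new TypeError('cannot convert ' . gettype($value) . ' to int without loss');
}
```

With this sketch, coerce_to_int("32") and coerce_to_int(37.0) succeed, while "100 dogs", 37.7 and true are rejected, with no explicit cast needed at the call site.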

It's true that there may be certain use cases that coercive type hints may make
more difficult - such as static analysis (I'm not entirely sure why that is,
but I never dived into that) - but that in itself isn't a good enough
reason, IMHO, to introduce a second, separate mode that deals with scalars
in such a different way than the rest of PHP.

Obviously, I think 'weak' campers have a lot to gain too - by making
sensible conversions work fine as expected, without having to resort to
explicit casts.
And everyone stands to gain from having just one mode, instead of two.
The coercive typing approach would require each camp to give up a bit of
their 'ideology', but it also gives both schools of thought most of what
they want, including the key tenets for each camp (rejecting non-sensible
conversions - always, allowing sensible ones - always). I believe that's
what makes it a good compromise, a better one than the currently proposed
RFC.

Thanks!

Zeev

Thank you for the detailed reply, Zeev.

I am not a language engineer myself, so I can't speak to how or if
full-static would be more performant. I am mostly relying on the
statement of others such as Anthony that it would be the case and trying
to summarize/rephrase the camps in terms of the desired benefit (DX and
performance/correctness) rather than the implementation ("weak" vs
"strong"). If it's possible to mostly have our cake and eat it too, I'm
all for that. Anthony and Stas are discussing the details of that in
the (now-misnamed) spin-off thread and much of it is sadly over my head.

Anthony, can you expand here at all about the practical benefits of
strong-typing for variable passing for the compiler? That seems to be
the main point of contention: Whether or not there are real, practical
benefits to be had in the compiler of knowing that a call will be in
"strict mode". (If there are, then the split-mode makes sense. If there
are not, then there's little benefit to it.)

Either way, I agree 100% with Zeev that we can/should tighten up the
coercion logic. In 16 years of writing PHP I have never once had a
situation where using "99 red balloons" in a context that wants an
integer wasn't a bug.

10 years ago by francois@php.net

Hi Larry,

Thanks for trying to sort this out. This is very valuable work.

De : Larry Garfield [mailto:larry@garfieldtech.com]

Anthony, can you expand here at all about the practical benefits of
strong-typing for variable passing for the compiler?

I will let Anthony reply as the question is for him, but can you specify which compiler you're talking about? The compiler layer of the PHP interpreter, or a future AOT compiler generating native executable code from PHP code? I guess the reply will be different in each case.

Regards

François

10 years ago by Anthony Ferrara

Larry,

Anthony, can you expand here at all about the practical benefits of
strong-typing for variable passing for the compiler? That seems to be the
main point of contention: Whether or not there are real, practical benefits
to be had in the compiler of knowing that a call will be in "strict mode".
(If there are, then the split-mode makes sense. If there are not, then
there's little benefit to it.)

For the normal compiler & engine there will be no benefit for the
foreseeable future.

For a tracing JIT compiler, there will be no advantage.

For a local JIT compiler, there can be some optimizations around
reduced conversion logic generated (and hence potentially better cache
efficiency, etc). A guard would still be generated, but that's a
single branch rather than the full cast logic. This would likely be a
small gain (likely less than 1%, possibly significantly less).

For an AOT compiler (optimizing compiler), more optimizations and
therefore gains can be had. The big difference here is that type
assertions can be done at compile time. So that means one less branch
(no guard) per argument per function call. In addition, native calls
can be used in a lot of cases, which means the compiled code doesn't
even need to know about a zval (significant memory and access
reduction). This has potential to be significant. Not to mention the
other optimizations that are possible.
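
The "single guard versus full cast logic" distinction can be modelled in userland PHP. This is an illustrative model only: a real JIT would emit machine code, the function names enter_strict() and enter_coercive() are hypothetical, and the coercive slow path here uses filter_var() as a stand-in for whatever conversion rules are finally chosen.

```php
<?php
// Strict semantics: the call site needs one type guard and nothing else.
function enter_strict($arg): int
{
    if (!is_int($arg)) {
        throw new TypeError('int expected');
    }
    return $arg;
}

// Coercive semantics: the same guard on the fast path, with conversion
// logic behind it for values that are not already ints.
function enter_coercive($arg): int
{
    if (is_int($arg)) {
        return $arg;               // fast path: identical single branch
    }
    // Slow path (stand-in rule): only integer-like strings are converted.
    $converted = is_string($arg) ? filter_var($arg, FILTER_VALIDATE_INT) : false;
    if ($converted === false) {
        throw new TypeError('value cannot be converted to int without loss');
    }
    return $converted;
}
```

For an argument that is already an int, both functions take the same branch; they differ only in the extra code that has to exist behind the guard.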

However, I think making this decision based on performance is the
incorrect way of doing it. For the Zend engine, there will be no
discernible difference between the proposals. It's a red herring. The
difference I would focus on is the ability to statically analyze the
code (with the benefits that come with it).

Anthony

10 years ago by Dmitry Stogov

On Fri, Feb 20, 2015 at 4:57 AM, Anthony Ferrara ircmaxell@gmail.com
wrote:

Larry,

Anthony, can you expand here at all about the practical benefits of
strong-typing for variable passing for the compiler? That seems to be
the
main point of contention: Whether or not there are real, practical
benefits
to be had in the compiler of knowing that a call will be in "strict
mode".
(If there are, then the split-mode makes sense. If there are not, then
there's little benefit to it.)

For the normal compiler & engine there will be no benefit for the
foreseeable future.

For a tracing JIT compiler, there will be no advantage.

For a local JIT compiler, there can be some optimizations around
reduced conversion logic generated (and hence potentially better cache
efficiency, etc). A guard would still be generated, but that's a
single branch rather than the full cast logic. This would likely be a
small gain (likely less than 1%, possibly significantly less).

For an AOT compiler (optimizing compiler), more optimizations and
therefore gains can be had. The big difference here is that type
assertions can be done at compile time.

An AOT compiler that knows the type of the passed argument and the expected
parameter type may eliminate the guard check regardless of hint semantics
(strong or weak). If you don't know the first or the second, you'll have to
generate guard code anyway, again regardless of hint semantics (strong or
weak). Is this wrong?

We may introduce strong type hints because of your mistake.

So that means one less branch
(no guard) per argument per function call. In addition, native calls
can be used in a lot of cases, which means the compiled code doesn't
even need to know about a zval (significant memory and access
reduction). This has potential to be significant. Not to mention the
other optimizations that are possible.

This already worked for us without type hinting.

However, I think making this decision based on performance is the
incorrect way of doing it. For the Zend engine, there will be no
discernible difference between the proposals. It's a red herring. The
difference I would focus on is the ability to statically analyze the
code (with the benefits that come with it).

Completely agree, changing the language for the compiler's sake is not fair.
It's clear that statically typed languages are more suitable for compilers, but we
won't make PHP statically typed.
Also, modern JS engines have shown what they can do without typing.

In my opinion strict type hints may be useful for program verification, but
again, I wouldn't like to change the whole language semantics just to get
a few unit tests out of the box.

Thanks. Dmitry.

Anthony

10 years ago by Pierre Joye

hi Dmitry,

On Fri, Feb 20, 2015 at 4:57 AM, Anthony Ferrara ircmaxell@gmail.com
wrote:

Larry,

Anthony, can you expand here at all about the practical benefits of
strong-typing for variable passing for the compiler? That seems to be
the
main point of contention: Whether or not there are real, practical
benefits
to be had in the compiler of knowing that a call will be in "strict
mode".
(If there are, then the split-mode makes sense. If there are not, then
there's little benefit to it.)

For the normal compiler & engine there will be no benefit for the
foreseeable future.

For a tracing JIT compiler, there will be no advantage.

For a local JIT compiler, there can be some optimizations around
reduced conversion logic generated (and hence potentially better cache
efficiency, etc). A guard would still be generated, but that's a
single branch rather than the full cast logic. This would likely be a
small gain (likely less than 1%, possibly significantly less).

For an AOT compiler (optimizing compiler), more optimizations and
therefore gains can be had. The big difference here is that type
assertions can be done at compile time.

An AOT compiler that knows the type of the passed argument and the expected
parameter type may eliminate the guard check regardless of hint semantics
(strong or weak). If you don't know the first or the second, you'll have to
generate guard code anyway, again regardless of hint semantics (strong or
weak). Is this wrong?

We may introduce strong type hints because of your mistake.

May, could, would, all that are totally irrelevant to the debate about
type hinting. The speed benefit is not significant.

I think we can agree on that, and we did as far as I can tell :)

However, I think making this decision based on performance is the
incorrect way of doing it. For the Zend engine, there will be no
discernible difference between the proposals. It's a red herring. The
difference I would focus on is the ability to statically analyze the
code (with the benefits that come with it).

Completely agree, changing the language for the compiler's sake is not fair.
It's clear that statically typed languages are more suitable for compilers, but we
won't make PHP statically typed.
Also, modern JS engines have shown what they can do without typing.

Let's put things correctly, please:

In my opinion strict type hints may be useful for program verification, but
again, I wouldn't like to change the whole language semantics

We are talking about argument handling here, not the whole language
semantics. The way the language works will stay the same. I am not
writing that for you but for all others who may misinterpret your
reply.

just to get a few unit tests out of the box.

Strict type handling for arguments goes way beyond having a few unit
tests. It would be very good if one single point of the argumentation were
not used to generalize a counter-argument. That makes no sense and it simply
goes down a way I would really not like to see again.

Cheers,

Pierre

@pierrejoye | http://www.libgd.org

10 years ago by Dmitry Stogov

hi Dmitry,

On Fri, Feb 20, 2015 at 4:57 AM, Anthony Ferrara ircmaxell@gmail.com
wrote:

Larry,

Anthony, can you expand here at all about the practical benefits of
strong-typing for variable passing for the compiler? That seems to be
the
main point of contention: Whether or not there are real, practical
benefits
to be had in the compiler of knowing that a call will be in "strict
mode".
(If there are, then the split-mode makes sense. If there are not, then
there's little benefit to it.)

For the normal compiler & engine there will be no benefit for the
foreseeable future.

For a tracing JIT compiler, there will be no advantage.

For a local JIT compiler, there can be some optimizations around
reduced conversion logic generated (and hence potentially better cache
efficiency, etc). A guard would still be generated, but that's a
single branch rather than the full cast logic. This would likely be a
small gain (likely less than 1%, possibly significantly less).

For an AOT compiler (optimizing compiler), more optimizations and
therefore gains can be had. The big difference here is that type
assertions can be done at compile time.

An AOT compiler that knows the type of the passed argument and the expected
parameter type may eliminate the guard check regardless of hint semantics
(strong or weak). If you don't know the first or the second, you'll have to
generate guard code anyway, again regardless of hint semantics (strong or
weak). Is this wrong?

We may introduce strong type hints because of your mistake.

May, could, would, all that are totally irrelevant to the debate about
type hinting. The speed benefit is not significant.

What is significant? The miraculous ability of static analysis for AOT?

I think we can agree on that, and we did as far as I can tell :)

I didn't agree with you.
I probably said that the performance impact of a run-time switch of type hinting
semantics is slightly negative and that it would be great to fix it if possible.

However, I think making this decision based on performance is the
incorrect way of doing it. For the Zend engine, there will be no
discernible difference between the proposals. It's a red herring. The
difference I would focus on is the ability to statically analyze the
code (with the benefits that come with it).

Completely agree, changing the language for the compiler's sake is not fair.
It's clear that statically typed languages are more suitable for compilers, but we
won't make PHP statically typed.
Also, modern JS engines have shown what they can do without typing.

Let's put things correctly, please:

In my opinion strict type hints may be useful for program verification, but
again, I wouldn't like to change the whole language semantics

We are talking about argument handling here, not the whole language
semantics. The way the language works will stay the same. I am not
writing that for you but for all others who may misinterpret your
reply.

just to get a few unit tests out of the box.

Strict type handling for arguments goes way beyond having a few unit
tests. It would be very good if one single point of the argumentation were
not used to generalize a counter-argument. That makes no sense and it simply
goes down a way I would really not like to see again.

I didn't hear any arguments for strict typing except for program
verification and static analysis; maybe I missed some.
Please tell me a few use cases; maybe it'll change my mind :)

Thanks. Dmitry.

Cheers,

Pierre

@pierrejoye | http://www.libgd.org

10 years ago by Anthony Ferrara

Dmitry

hi Dmitry,

On Fri, Feb 20, 2015 at 4:57 AM, Anthony Ferrara ircmaxell@gmail.com
wrote:

Larry,

Anthony, can you expand here at all about the practical benefits of
strong-typing for variable passing for the compiler? That seems to
be
the
main point of contention: Whether or not there are real, practical
benefits
to be had in the compiler of knowing that a call will be in "strict
mode".
(If there are, then the split-mode makes sense. If there are not,
then
there's little benefit to it.)

For the normal compiler & engine there will be no benefit for the
foreseeable future.

For a tracing JIT compiler, there will be no advantage.

For a local JIT compiler, there can be some optimizations around
reduced conversion logic generated (and hence potentially better cache
efficiency, etc). A guard would still be generated, but that's a
single branch rather than the full cast logic. This would likely be a
small gain (likely less than 1%, possibly significantly less).

For an AOT compiler (optimizing compiler), more optimizations and
therefore gains can be had. The big difference here is that type
assertions can be done at compile time.

An AOT compiler that knows the type of the passed argument and the expected
parameter type may eliminate the guard check regardless of hint semantics
(strong or weak). If you don't know the first or the second, you'll have to
generate guard code anyway, again regardless of hint semantics (strong or
weak). Is this wrong?

We may introduce strong type hints because of your mistake.

May, could, would, all that are totally irrelevant to the debate about
type hinting. The speed benefit is not significant.

What is significant? The miraculous ability of static analysis for AOT?

Please, can we discuss something without snark? And can we get past
AOT? It's distracting. I only mentioned it here because I was
specifically asked about it. It's not in my RFC. So please, let's get
past it.

I think we can agree on that, and we did as far as I can tell :)

I didn't agree with you.
I probably said that the performance impact of a run-time switch of type hinting
semantics is slightly negative and that it would be great to fix it if possible.

Everyone is saying this shouldn't be voted on based on performance.
You, me, Pierre, everyone.

Additionally, the negative impact could be solved by introducing a new
opcode for scalar checks, pushing any performance difference to
compile time. But I'd like to see some measurements of the performance
difference prior to going down that road. In short, the negative
performance difference is either going to be negligible (won't appear
on a benchmark) or can more than likely be made negligible without a
terrible amount of work.

However, I think making this decision based on performance is the
incorrect way of doing it. For the Zend engine, there will be no
discernible difference between the proposals. It's a red herring. The
difference I would focus on is the ability to statically analyze the
code (with the benefits that come with it).

Completely agree, changing the language for the compiler's sake is not fair.
It's clear that statically typed languages are more suitable for compilers, but we
won't make PHP statically typed.
Also, modern JS engines have shown what they can do without typing.

Let's put things correctly, please:

In my opinion strict type hints may be useful for program verification, but
again, I wouldn't like to change the whole language semantics

We are talking about argument handling here, not the whole language
semantics. The way the language works will stay the same. I am not
writing that for you but for all others who may misinterpret your
reply.

just to get a few unit tests out of the box.

Strict type handling for arguments goes way beyond having a few unit
tests. It would be very good if one single point of the argumentation were
not used to generalize a counter-argument. That makes no sense and it simply
goes down a way I would really not like to see again.

I didn't hear any arguments for strict typing except for program
verification and static analysis; maybe I missed some.
Please tell me a few use cases; maybe it'll change my mind :)

verification and static analysis aren't enough?

Seriously, equating static analysis to "a few unit tests" is either
unnecessarily hyperbolic or a complete misconstrual of the point.
"Unit tests" at best tell you that the code behaves as the tests say.
Those tests can be bogus, but the tests still pass. Static analysis on
the other hand can tell you if the code is semantically correct or not
(whether or not errors can/will be thrown). The type system provides a
lower bound on correctness. The "unit tests" at best establish an
upper bound.

For a 1,000 line-of-code application written by a single developer,
the difference is trivial. For a 10,000 line-of-code application
written by 2 developers, it's likely not going to make a massive
difference.

But for a 100,000 line-of-code or 1,000,000 line-of-code or 10,000,000
line-of-code application written and maintained by dozens of
developers, it becomes massively important. There's a reason large
companies are investing massive amounts of money into statically typed
languages (Hack, Go, Rust, etc). For a sense of scale, the Symfony 2
framework in PHP is 129838 non-comment lines of code, with over 1000
contributors.

That's not saying you should want to use static typing for
everything. Nor would I support PHP moving to pure static
typing (which is why the proposal I'm backing doesn't). Instead, what
I'm suggesting is to keep with the philosophy of PHP 5's object
system: strongly typed when the user chooses, but optional. That way
the user is empowered to make the decisions best for them.

Anthony

hi Dmitry,

On Fri, Feb 20, 2015 at 4:57 AM, Anthony Ferrara ircmaxell@gmail.com
wrote:

Larry,

Anthony, can you expand here at all about the practical benefits of
strong-typing for variable passing for the compiler? That seems to
be
the
main point of contention: Whether or not there are real, practical
benefits
to be had in the compiler of knowing that a call will be in "strict
mode".
(If there are, then the split-mode makes sense If there are not,
then
there's little benefit to it.)

For the normal compiler & engine there will be no benefit for the
foreseeable future.

For a tracing JIT compiler, there will be no advantage.

For a local JIT compiler, there can be some optimizations around
reduced conversion logic generated (and hence potentially better cache
efficiency, etc). A guard would still be generated, but that's a
single branch rather than the full cast logic. This would likely be a
small gain (likely less than 1%, possibly significantly less).

For a AOT compiler (optimizing compiler), more optimizations and
therefore gains can be had. The big difference here is that type
assertions can be done at compile time.

AOT compiler that know type of passed argument and expected parameter
type,
may eliminate guard check independently on hint semantic (strong or
week).
If you don't know first or second you'll have to generate guard code
anyway
independently from hint semantic (strong or week). Is this wrong?

We may introduce strong type hints because of your mistake.

May, could, would, all that are totally irrelevant to the debate about
type hinting. The speed benefit is not significant.

What is significant? Miracle ability of static analyzes for AOT?

I think we can agree on that, and we did as far as I can tell :)

I didn't agree with you.
Probably, I told that performance impact of run-time switch of type hinting
semantic is slightly negative and it would be great to fix it if possible.

However, I think making this decision based on performance is the
incorrect way of doing it. For the Zend engine, there will be no
discernible difference between the proposals. It's a red herring. The
difference I would focus on is the ability to statically analyze the
code (with the benefits that come with it).

Completely agree; changing the language for the sake of a compiler is not fair.
It's clear that statically typed languages are more suitable, but we won't
make PHP statically typed.
Also, modern JS engines have shown what can be done without typing.

Let's put things correctly, please:

In my opinion strict type hints may be useful for program verification, but
again, I wouldn't like to change the whole language semantics

We are talking about argument handling here. Not the whole language
semantics. The way the language works will stay the same. I am not
writing that for you but for all others who may misinterpret your
reply.

just to get a few unit tests out of the box.

Strict type handling for arguments goes way beyond having a few unit
tests. It would be very good if one single point of the argumentation were
not used to generalize a counter-argument. That makes no sense and it simply
goes down a path I would really not like to see again.

I haven't heard any arguments for strict typing except for program
verification and static analysis; maybe I missed some.
Please tell me a few use cases; maybe it'll change my mind :)

Thanks. Dmitry.

Cheers,

Pierre

@pierrejoye | http://www.libgd.org

10 years ago by Zeev Suraski

verification and static analysis aren't enough?

Anthony,

While IMHO they're not enough to warrant substantial deviation from PHP's behavior, this is a subjective question that others might answer differently.

But there's also an objective issue. There's a serious question mark whether the type of hint - strict, coercive or otherwise - can have any sort of implications on one's ability to conduct static analysis, JIT or AOT (I'm bringing those up again since they're closely related in terms of what you can or cannot infer).

Now, I'll contend that even though I don't think we are, perhaps we're missing something. But at the very least it should be clear to the list there's serious doubt on whether there's any extra value there even if they do seem static analysis critical. If there is, it's likely to be very, very limited in scope.

That's not saying you should want to use statically typed for
everything. And nor would I support PHP moving to pure statically
typed (which is why the proposal I'm backing doesn't).

We're on the same page here. But the kinds of static analysis benefits you seem to believe we can get from strict type hints would require that - strong typing, variable declarations, perhaps changes to casting rules - not just around that narrow interface between callers and callees. Thankfully that's not on the table.

Thanks,

Zeev

10 years ago by Anthony Ferrara

Zeev,

verification and static analysis aren't enough?

Anthony,

While IMHO they're not enough to warrant substantial deviation from PHP's behavior, this is a subjective question that others might answer differently.

But there's also an objective issue. There's a serious question mark whether the type of hint - strict, coercive or otherwise - can have any sort of implications on one's ability to conduct static analysis, JIT or AOT (I'm bringing those up again since they're closely related in terms of what you can or cannot infer).

Now, I'll contend that even though I don't think we are, perhaps we're missing something. But at the very least it should be clear to the list there's serious doubt on whether there's any extra value there even if they do seem static analysis critical. If there is, it's likely to be very, very limited in scope.

Let's simply agree to disagree here :-)

That's not saying you should want to use statically typed for
everything. And nor would I support PHP moving to pure statically
typed (which is why the proposal I'm backing doesn't).

We're on the same page here. But the kinds of static analysis benefits you seem to believe we can get from strict type hints would require that - strong typing, variable declarations, perhaps changes to casting rules - not just around that narrow interface between callers and callees. Thankfully that's not on the table.

That's also not necessary in most cases. You can infer a lot about the
types of variables just having arguments declared. In most cases, you
can infer enough for static analysis to work. In the cases you can't,
that's actually a valid result of the analysis because you may have
undefined behavior. Example:

function foo(string $a): int {
    return $a + 1;
}

You can't infer the type of $a+1 because the conversion of $a->numeric
that happens is unstable from a type perspective. But PHP's type
changes are predictable enough that the majority of sane cases are
predictable.

Both Swift and Go behave like this: you only need explicit
declarations on the arguments, and the rest can be inferred. And where it
can't infer, it raises a type error.
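
To make the inference problem concrete, here is a minimal sketch, assuming PHP 7 or later (where scalar type declarations eventually shipped). The name plusOne() is hypothetical; it has the same body as foo() above but drops the return type, so the type produced by the addition stays visible.

```php
<?php
// Hypothetical helper: same body as foo() above, but without the
// ": int" return type, so the type produced by the addition is visible.
function plusOne(string $a)
{
    return $a + 1;
}

// The declared parameter type is string in both calls, yet the result
// type depends on the runtime value of that string.
$fromIntegerString = plusOne("1");   // integer-like numeric string
$fromFloatString   = plusOne("1.5"); // float-like numeric string

var_dump($fromIntegerString);
var_dump($fromFloatString);
```

An analyzer that only knows "$a is a string" therefore cannot pin $a + 1 down to a single type, which is the instability described above.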

Anthony

10 years ago by Dmitry Stogov

Hi Anthony,

On Fri, Feb 20, 2015 at 5:55 PM, Anthony Ferrara ircmaxell@gmail.com
wrote:

Dmitry

On Fri, Feb 20, 2015 at 5:08 PM, Pierre Joye pierre.php@gmail.com
wrote:

hi Dmitry,

On Thu, Feb 19, 2015 at 11:13 PM, Dmitry Stogov dmitry@zend.com
wrote:

On Fri, Feb 20, 2015 at 4:57 AM, Anthony Ferrara ircmaxell@gmail.com
wrote:

Larry,

Anthony, can you expand here at all about the practical benefits of
strong-typing for variable passing for the compiler? That seems to be the
main point of contention: Whether or not there are real, practical benefits
to be had in the compiler of knowing that a call will be in "strict mode".
(If there are, then the split-mode makes sense. If there are not, then
there's little benefit to it.)

For the normal compiler & engine there will be no benefit for the
foreseeable future.

For a tracing JIT compiler, there will be no advantage.

For a local JIT compiler, there can be some optimizations around
reduced conversion logic generated (and hence potentially better cache
efficiency, etc). A guard would still be generated, but that's a
single branch rather than the full cast logic. This would likely be a
small gain (likely less than 1%, possibly significantly less).

For an AOT compiler (optimizing compiler), more optimizations and
therefore gains can be had. The big difference here is that type
assertions can be done at compile time.

An AOT compiler that knows the type of the passed argument and the expected
parameter type may eliminate the guard check regardless of the hint
semantics (strong or weak).
If you don't know the first or the second, you'll have to generate guard code
anyway, regardless of the hint semantics (strong or weak). Is this wrong?

We may introduce strong type hints because of your mistake.

May, could, would, all that are totally irrelevant to the debate about
type hinting. The speed benefit is not significant.

What is significant? A miraculous ability of static analysis for AOT?

Please, can we discuss something without snark? And can we get past
AOT? It's distracting. I only mentioned it here because I was
specifically asked about it. It's not in my RFC. So please, let's get
past it.

Sorry, I shouldn't be so emotional.
Actually, it's hard to express emotions with my bad English :)
I think you are doing a great job regarding AOT, but I think strict types
won't help you a lot.
We may discuss it off the list next week.

I think we can agree on that, and we did as far as I can tell :)

I didn't agree with you.
I probably said that the performance impact of a run-time switch of type
hinting semantics is slightly negative, and that it would be great to fix it
if possible.

Everyone is saying this shouldn't be voted on based on performance.
You, me, Pierre, everyone.

Performance is not a blocker for the vote, but that doesn't mean it's not
important.

Additionally, the negative impact could be solved by introducing a new
opcode for scalar checks, pushing any performance difference to
compile time. But I'd like to see some measurements of the performance
difference prior to going down that road. In short, the negative
performance difference is either going to be negligible (won't appear
on a benchmark) or can more than likely be made negligible without a
terrible amount of work.

Currently type checks are performed in the ZEND_RECV opcode, so we can't use a
special opcode here.
We may try to perform the checks in the ZEND_SEND_... opcodes. If this works
without performance degradation, great.
But we will have to verify all the possible calling mechanisms carefully,
e.g. direct calls, indirect calls through call_user_function(), __call(),
__get(), error callbacks, etc. Then we won't need to check types in RECV.
If you could try to implement this, it would be great.
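
Whichever opcode ends up performing the check, it has to give the same answer for every way of reaching the callee. The following is a minimal userland sketch of that requirement, assuming PHP 7 or later in its default (weak) mode; takesInt() and Proxy are hypothetical names, and the sketch says nothing about how the engine implements the check internally.

```php
<?php
// Hypothetical callee with a scalar parameter type.
function takesInt(int $n): int
{
    return $n;
}

// Hypothetical proxy that forwards unknown method calls to takesInt().
class Proxy
{
    public function __call($name, $arguments)
    {
        return takesInt(...$arguments);
    }
}

// Three of the call paths listed above, each passing a non-numeric string.
$callPaths = [
    'direct'         => function () { return takesInt("abc"); },
    'call_user_func' => function () { return call_user_func('takesInt', "abc"); },
    '__call'         => function () { return (new Proxy())->anything("abc"); },
];

// Record whether each path accepted the argument or raised a TypeError.
$outcomes = [];
foreach ($callPaths as $label => $call) {
    try {
        $call();
        $outcomes[$label] = 'accepted';
    } catch (TypeError $e) {
        $outcomes[$label] = 'TypeError';
    }
}

print_r($outcomes);
```

A check moved to the caller side would need to keep all three outcomes identical, which is the verification work described above.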

Thanks. Dmitry.

However, I think making this decision based on performance is the
incorrect way of doing it. For the Zend engine, there will be no
discernible difference between the proposals. It's a red herring. The
difference I would focus on is the ability to statically analyze the
code (with the benefits that come with it).

Completely agree; changing the language for the sake of a compiler is not fair.
It's clear that statically typed languages are more suitable, but we won't
make PHP statically typed.
Also, modern JS engines have shown what can be done without typing.

Let's put things correctly, please:

In my opinion strict type hints may be useful for program verification, but
again, I wouldn't like to change the whole language semantics

We are talking about argument handling here. Not the whole language
semantics. The way the language works will stay the same. I am not
writing that for you but for all others who may misinterpret your
reply.

just to get a few unit tests out of the box.

Strict type handling for arguments goes way beyond having a few unit
tests. It would be very good if one single point of the argumentation were
not used to generalize a counter-argument. That makes no sense and it simply
goes down a path I would really not like to see again.

I haven't heard any arguments for strict typing except for program
verification and static analysis; maybe I missed some.
Please tell me a few use cases; maybe it'll change my mind :)

verification and static analysis aren't enough?

Seriously, equating static analysis to "a few unit tests" is either
unnecessarily hyperbolic or a complete misconstrual of the point.
"Unit tests" at best tell you that the code behaves as the tests say.
Those tests can be bogus, but the tests still pass. Static analysis on
the other hand can tell you if the code is semantically correct or not
(whether or not errors can/will be thrown). The type system provides a
lower bound on correctness. The "unit tests" at best establish an
upper bound.

For static analysis you actually don't need strict types in the language
itself.
If a static analyzer finds a type mismatch (one that may still work under weak
hinting rules), it may report a warning.
Then users may add an explicit type conversion.

For a 1,000 line-of-code application written by a single developer,
the difference is trivial. For a 10,000 line-of-code application
written by 2 developers, it's likely not going to make a massive
difference.

But for a 100,000 line-of-code or 1,000,000 line-of-code or 10,000,000
line-of-code application written and maintained by dozens of
developers, it becomes massively important. There's a reason large
companies are investing massive amounts of money into statically typed
languages (Hack, Go, Rust, etc). For a sense of scale, Symfony 2
framework in PHP is 129838 non-comment lines of code. With over 1000
contributors.

But PHP is a dynamically typed language and it shouldn't be turned into a
static one.
Think about our users: why do they use PHP? Because it's a simple-to-use
scripting language.

That's not saying you should want to use statically typed for
everything. And nor would I support PHP moving to pure statically
typed (which is why the proposal I'm backing doesn't). Instead, what
I'm suggesting is to keep with the philosophy of PHP 5's object
system. Strongly typed when the user chooses, but it's optional. That way
the user is empowered to make the decisions best for them.

Argument type hinting is just a first step towards a stronger type system,
but strong != static.

Thanks. Dmitry.

Anthony

10 years ago by Pierre Joye

Hi Anthony,

On Fri, Feb 20, 2015 at 5:55 PM, Anthony Ferrara ircmaxell@gmail.com
wrote:

Dmitry

On Fri, Feb 20, 2015 at 5:08 PM, Pierre Joye pierre.php@gmail.com
wrote:

hi Dmitry,

On Thu, Feb 19, 2015 at 11:13 PM, Dmitry Stogov dmitry@zend.com
wrote:

On Fri, Feb 20, 2015 at 4:57 AM, Anthony Ferrara
ircmaxell@gmail.com
wrote:

Larry,

Anthony, can you expand here at all about the practical benefits of
strong-typing for variable passing for the compiler? That seems to be the
main point of contention: Whether or not there are real, practical benefits
to be had in the compiler of knowing that a call will be in "strict mode".
(If there are, then the split-mode makes sense. If there are not, then
there's little benefit to it.)

For the normal compiler & engine there will be no benefit for the
foreseeable future.

For a tracing JIT compiler, there will be no advantage.

For a local JIT compiler, there can be some optimizations around
reduced conversion logic generated (and hence potentially better cache
efficiency, etc). A guard would still be generated, but that's a
single branch rather than the full cast logic. This would likely be a
small gain (likely less than 1%, possibly significantly less).

For an AOT compiler (optimizing compiler), more optimizations and
therefore gains can be had. The big difference here is that type
assertions can be done at compile time.

An AOT compiler that knows the type of the passed argument and the expected
parameter type may eliminate the guard check regardless of the hint
semantics (strong or weak).
If you don't know the first or the second, you'll have to generate guard code
anyway, regardless of the hint semantics (strong or weak). Is this wrong?

We may introduce strong type hints because of your mistake.

May, could, would, all that are totally irrelevant to the debate about
type hinting. The speed benefit is not significant.

What is significant? A miraculous ability of static analysis for AOT?

Please, can we discuss something without snark? And can we get past
AOT? It's distracting. I only mentioned it here because I was
specifically asked about it. It's not in my RFC. So please, let's get
past it.

Sorry, I shouldn't be so emotional.
Actually, it's hard to express emotions with my bad English :)
I think you are doing a great job regarding AOT, but I think strict types
won't help you a lot.
We may discuss it off the list next week.

If anywhere, on EFNet; otherwise the ML must remain the place to discuss things.

I think we can agree on that, and we did as far as I can tell :)

I didn't agree with you.
I probably said that the performance impact of a run-time switch of type
hinting semantics is slightly negative, and that it would be great to fix it
if possible.

Everyone is saying this shouldn't be voted on based on performance.
You, me, Pierre, everyone.

Performance is not a blocker for the vote, but that doesn't mean it's not
important.

It is, only not in this context unless the patch changed in such ways
that the performance impact becomes relevant. However, as of now, it
is not. See Matt's perf results using the usual suspects.

Cheers,
Pierre

10 years ago by francois@php.net

Hi Anthony,

I guess you would keep supporting __toString()? So, you should probably consider 'string' as 'string|object'.
Adding this case to 'float' meaning 'int|float' and 'callable' resolving to 'string|array|object', are you sure it's worth the pain implementing and supporting a dual-mode mechanism, compared to the ruleset I am intending to propose (currently in draft): https://wiki.php.net/rfc/zpp-conversion-rules ?

Actually, using such a ruleset, I guess you could infer less, but the difference wouldn't be so important.

Only 3 conversions still depend on the value, of which one can be made type-dependent-only if requested during discussion (float to int proposed as lossless-only). Every type except int, float, and string is also proposed as 100% strict.

I'm not sure we can go much further with a single-mode approach, but I'd appreciate your opinion. Of course, anyone else is welcome too.

Regards

François

-----Original message-----
From: Anthony Ferrara [mailto:ircmaxell@gmail.com]
Sent: Friday, February 20, 2015 02:58
To: Larry Garfield
Cc: internals@lists.php.net
Subject: Re: [PHP-DEV] Reviving scalar type hints

Larry,

Anthony, can you expand here at all about the practical benefits of
strong-typing for variable passing for the compiler? That seems to be the
main point of contention: Whether or not there are real, practical benefits
to be had in the compiler of knowing that a call will be in "strict mode".
(If there are, then the split-mode makes sense. If there are not, then
there's little benefit to it.)

For the normal compiler & engine there will be no benefit for the
foreseeable future.

For a tracing JIT compiler, there will be no advantage.

For a local JIT compiler, there can be some optimizations around
reduced conversion logic generated (and hence potentially better cache
efficiency, etc). A guard would still be generated, but that's a
single branch rather than the full cast logic. This would likely be a
small gain (likely less than 1%, possibly significantly less).

For an AOT compiler (optimizing compiler), more optimizations and
therefore gains can be had. The big difference here is that type
assertions can be done at compile time. So that means one less branch
(no guard) per argument per function call. In addition, native calls
can be used in a lot of cases, which means the compiled code doesn't
even need to know about a zval (significant memory and access
reduction). This has potential to be significant. Not to mention the
other optimizations that are possible.

However, I think making this decision based on performance is the
incorrect way of doing it. For the Zend engine, there will be no
discernible difference between the proposals. It's a red herring. The
difference I would focus on is the ability to statically analyze the
code (with the benefits that come with it).

Anthony

10 years ago by Anthony Ferrara

Francois,

Hi Anthony,

I guess you would keep supporting __toString() ? So, you should probably consider 'string' as 'string|object'.
Adding this case to 'float' meaning 'int|float' and 'callable' resolving to 'string|array|object', are you sure it's worth the pain implementing and supporting a dual-mode mechanism, compared to the ruleset I am intending to propose (currently in draft): https://wiki.php.net/rfc/zpp-conversion-rules ?

No. The rules are quite clearly explained in the RFC. The only thing
that can resolve a string hint is a string. __toString is a form of
cast, and hence disallowed in strict mode (because 1: it's not
explicit, 2: it loses significant information).
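
As an illustration of this rule, here is a minimal sketch that assumes the declare(strict_types=1) form that eventually shipped in PHP 7, which may differ in detail from the RFC draft under discussion. Name and shout() are hypothetical.

```php
<?php
declare(strict_types=1);

// Hypothetical value object that can be converted to a string.
class Name
{
    private $value;

    public function __construct(string $value)
    {
        $this->value = $value;
    }

    public function __toString(): string
    {
        return $this->value;
    }
}

function shout(string $text): string
{
    return strtoupper($text);
}

$name = new Name("anthony");

// Implicit conversion through __toString() is not applied in strict mode.
try {
    shout($name);
    $implicit = 'accepted';
} catch (TypeError $e) {
    $implicit = 'TypeError';
}

// An explicit cast makes the conversion visible at the call site.
$explicit = shout((string) $name);

var_dump($implicit, $explicit);
```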

Callable is an existing rule, which is not subject to the "strict" discussion.

Actually, using such a ruleset, I guess you could infer less, but the difference wouldn't be so important.

Only 3 conversions still depend on the value, of which one can be made type-dependent-only if requested during discussion (float to int proposed as lossless-only). Every type except int, float, and string is also proposed as 100% strict.

None of the conversions use "value". Int, Float, String and Bool all
behave precisely about type. No runtime value information is used
(only runtime type info).

Please read the proposal and see what it says. Don't judge it by what
you assume, but by what it says.

Anthony

10 years ago by Anthony Ferrara

Francois,

Adding this case to 'float' meaning 'int|float' and 'callable' resolving to 'string|array|object', are you sure it's worth the pain implementing and supporting a dual-mode mechanism, compared to the ruleset I am intending to propose (currently in draft): https://wiki.php.net/rfc/zpp-conversion-rules ?

I've taken a look at that proposal, and here are my comments:

1. This RFC only talks about ZPP. I assume you're also talking about
   exposing the same ruleset to userland?

2. You're talking about making passing 1 to a bool parameter
   disallowed by default. Meaning that a LOT of code will break
   automatically when upgrading to 7. Because internal functions' error
   behavior will change.

   This will likely create a Python 2/3 situation here. You're making
   a MASSIVE bc break here.

   If you were just talking about user-land types, that would be one
   thing. But since you're explicitly talking about ZPP, this is a bad
   thing.

3. It still doesn't solve the problems the strict type proponents are
   looking to have solved (namely that the type is what's checked, not
   the value). The is_numeric_string cleanup is definitely a step in the
   right direction (and I think should be implemented anyway), but it
   still doesn't solve the problem.

Overall, this proposal feels like a compromise in a bad way. It makes
existing code error (bc break), doesn't give the weak proponents what
they want and doesn't give the strict proponents what they want
either. Just because it's the middle ground doesn't make it a good
thing.

Please consider these issues seriously prior to making the proposal official.

Anthony

10 years ago by Jordi Boggiano

If we want to add a "numeric" type as a virtual union of int and
float, that's one way to solve the concern. If we don't, we could also
allow widening primitive conversion (int -> float). That wouldn't work
well with bigints, but would be fine in other cases. But there are
plenty of languages that always require explicit type conversion. So
even if we choose that, we're in good company.

As far as I understand, allowing int -> float would help fix a few of
Rasmus' (or was it Benjamin?) concerns about things like sin() requiring
floats in strict mode. It would make strict mode a lot more usable with
C code as well. +1 on keeping the strictness benefits and removing some of
the drawbacks.

Having numeric in addition might be nice mostly for return values I
guess, but it seems a bit redundant to me if int -> float is allowed.

As for the straw poll, I also think declare() is the clearest syntax,
especially if it's enforced to appear at most once and on top of the
file to remove any potential misuses.
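
For reference, a minimal sketch of how these two points look in the form that eventually shipped in PHP 7: a file-level declare(strict_types=1) as the first statement, plus int -> float widening as the one permitted implicit conversion. The directive name differs from the declare(strict=true) spelling in the poll, and half() is a hypothetical function.

```php
<?php
declare(strict_types=1); // must be the first statement in the file

// Hypothetical userland function with a float parameter.
function half(float $x): float
{
    return $x / 2;
}

$widened = half(3); // int argument for a float parameter: accepted
$sine    = sin(0);  // internal function with a float parameter: accepted

// Other scalar conversions are still rejected in strict mode.
try {
    half("3");
    $fromString = 'accepted';
} catch (TypeError $e) {
    $fromString = 'TypeError';
}

var_dump($widened, $sine, $fromString);
```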

Cheers

--
Jordi Boggiano
@seldaek - http://nelm.io/jordi

10 years ago by francois@php.net

Hi,

Starting to make the strict mode weaker is nonsense. It is not just Rasmus' example. Which exceptions do you allow, then? Would you define a third 'pseudo-strict' mode? And what about static analyzers, will they be 'pseudo-strict' too?

No, if you want strict, it can only remain purely strict. I can understand the 'strict' position, but not the desire to make it weaker. Do you understand that, starting from the purely-strict position, I can give you use cases which will cause you to define additional exceptions, until you get the same conversion rules as I am currently defining? The only difference is that I don't claim it to be 'strict'.

And 'numeric' is either a new ZPP type, or an alias for a future union type. Nothing to do with pure strict mode.

These are all hacks trying to solve specific use cases.

Regards

François

-----Original message-----
From: Jordi Boggiano [mailto:j.boggiano@seld.be]
Sent: Tuesday, February 17, 2015 17:26
To: internals@lists.php.net
Subject: Re: [PHP-DEV] Reviving scalar type hints

If we want to add a "numeric" type as a virtual union of int and
float, that's one way to solve the concern. If we don't, we could also
allow widening primitive conversion (int -> float). That wouldn't work
well with bigints, but would be fine in other cases. But there are
plenty of languages that always require explicit type conversion. So
even if we choose that, we're in good company.

As far as I understand, allowing int -> float would help fix a few of
Rasmus' (or was it Benjamin?) concerns about things like sin() requiring
floats in strict mode. It would make strict mode a lot more usable with
C code as well. +1 on keeping the strictness benefits and removing some of
the drawbacks.

Having numeric in addition might be nice mostly for return values I
guess, but it seems a bit redundant to me if int -> float is allowed.

As for the straw poll, I also think declare() is the clearest syntax,
especially if it's enforced to appear at most once and on top of the
file to remove any potential misuses.

Cheers

--
Jordi Boggiano
@seldaek - http://nelm.io/jordi

10 years ago by Lester Caine

A static analyzer (one of the reasons people want strict) would error
there. The reason is that at compile time it can't reason about the
code well enough to determine if there's an error or not. You're
passing a string where you expect an int. Is that going to work? We
don't know. So the analyzer would need to throw a warning that the
cast is potentially unsafe because it can't guarantee that the runtime
won't throw an error. Which means that to remove the warning you'd
need to add an explicit cast.

This is what I do not get ...

I have no idea what string will be provided, so either we get a valid
number or we don't. Conversion of the string to a number needs to follow
several rules such as thousand divider or decimal point, or perhaps even
spelt out numbers. If the input string can't be converted to a number
then we need the error message - explicit casting fixed nothing - you
can't eliminate the error if the passed string can't be converted so you
need an alternate escape route. A generic 'type fault' does not help
since you want to return the fault to the inputting source? Static
analysis only works if you assume there is no human involvement in the
generation of the inputs?

Now if you are handling the problem of mistakes in the input prior to
the function call you already know that the data is good, so why do you
need 'strict' at all? You will process the string, decide if you want to
tell the client that it must be a whole number, or if it exceeds some
limit. That may even be done in JavaScript in the browser, so what is
fed back has already been sanitised. It will come in as a string and you
know it is a valid number ...
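
The validate-at-the-boundary pattern described here can be sketched as follows. This is a minimal illustration, not a statement about either proposal: parseQuantity() and reserve() are hypothetical names, and filter_var() with FILTER_VALIDATE_INT stands in for whatever input rules an application actually needs.

```php
<?php
// Hypothetical boundary function: turns raw request input into an int,
// or reports a problem that can be shown to whoever supplied the input.
function parseQuantity(string $raw): int
{
    $value = filter_var($raw, FILTER_VALIDATE_INT);
    if ($value === false) {
        throw new InvalidArgumentException("'$raw' is not a whole number");
    }
    return $value;
}

// Hypothetical inner function that only ever sees validated integers.
function reserve(int $quantity): string
{
    return "reserved $quantity";
}

$ok = reserve(parseQuantity("12"));

// A thousands separator is not accepted by FILTER_VALIDATE_INT, so the
// caller gets a specific error rather than a generic type fault.
try {
    parseQuantity("1,000");
    $rejected = false;
} catch (InvalidArgumentException $e) {
    $rejected = true;
}

var_dump($ok, $rejected);
```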

--
Lester Caine - G8HFL

10 years ago by francois@php.net

Hi Anthony,

I understand your concerns and think the work we are doing should be suitable for compilers and static analyzers, but it's hard to make everyone happy.

One thing I thought we should reserve for a future release is the addition of a set of strict type hints : something like (just example syntax) 'int!', 'float!', 'string!', 'bool!' (other types are strict already). These types would have their counterpart at the ZPP level and their parsing rule would be to reject everything except the corresponding zval type.

This way, the user can write :

function convertToInt(string $number): int! { // <- int! instead of int
    if (!preg_match("(^[0-9]{1,17}$)", $number)) {
        throw new InvalidArgumentException("Supplied argument is not a valid number");
    }
    return $number;
}

Which makes it usable for static analysis, I guess. These would be primarily used for return types.

Another valuable type would be something like 'numeric!', which could accept IS_LONG and IS_FLOAT only. This would be interesting for static analysis, to know that it cannot be a string, while accepting any zval numeric value. This is a little harder because it should be implemented as an alias of 'int!|float!', and we wanted to reserve union types for a future release.

About using zval type and value, I don't like it either, and I would prefer using types only, but I see no way to keep accepting "31" as int without depending on the runtime value. Additional strict types, as proposed above, may be a partial solution because those would never care about value.
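
The value dependence mentioned here can be seen in a minimal sketch, assuming the default (weak) mode of PHP 7 or later; acceptsInt() is a hypothetical name. Both arguments have the same type (string), and only their runtime values decide the outcome.

```php
<?php
// Hypothetical function with a weakly checked int parameter
// (no strict_types declaration in this file).
function acceptsInt(int $n): int
{
    return $n;
}

// A numeric string is converted to the declared type.
$numeric = acceptsInt("31");

// A non-numeric string of the same type is rejected.
try {
    acceptsInt("abc");
    $nonNumeric = 'accepted';
} catch (TypeError $e) {
    $nonNumeric = 'TypeError';
}

var_dump($numeric, $nonNumeric);
```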

However, the more I think about it, the more I think that this set of strict types will be necessary to design union types, as defining a set of rules to convert a zval to a union of weak types, while possible, will be a mess. So, future union types could be strict-only, which will make them suitable for analysis.

I'm afraid this is probably not what you expect but, as I previously said, I'm trying to satisfy as many people as possible. Maybe you're right and '0.3 or nothing' proponents are a 'crowd', but I am not doing politics. If this is the case, the 'crowd' will win and this will be perfect.

Now, if you want to collaborate on the compromise we are trying to build, I really value your opinion. I don't pretend to be right in any way and I understand your point of view but, if you stay on '0.3 or nothing', I cannot do much more.

Regards

François

10 years ago by francois@php.net

From: François Laupretre [mailto:francois@php.net]

This way, the user can write :

function convertToInt(string $number): int! { // <- int! instead of int
    if (!preg_match("(^[0-9]{1,17}$)", $number)) {
        throw new InvalidArgumentException("Supplied argument is not a valid number");
    }
    return $number;
}

A static analyzer would raise an error on this, as we are sure it fails, while :

function convertToInt(string $number): int! {
    if (!preg_match("(^[0-9]{1,17}$)", $number)) {
        throw new InvalidArgumentException("Supplied argument is not a valid number");
    }
    return (int)$number;
}

would be analyzed as OK. That's what you want, isn't it?

Regards

François

10 years ago by Martin Keckeis

2015-02-17 0:58 GMT+01:00 Sara Golemon pollita@php.net:

On Mon, Feb 16, 2015 at 2:50 PM, François Laupretre francois@php.net
wrote:

Straw poll:

1) <?php strict;

2) <?php-strict

3) use strict; (pseudo-namespace)

4) <?php // strict (I don't actually like HHVM's style, but if you do...)

5) declare(strict=true); (As a top-level declare only)

6) declare(strict=true); (exactly as in v0.3 -- maybe you liked it)

your write-in vote here

(I renumbered the duplicated option 3 in the quote.)

+1 for all options that can only be defined on the "first line", so: 1, 2, 4, 5

10 years ago by Derick Rethans

Once again, anyone can take over version 0.3, if it is so great. Why don't you do it ?
I will play the game, stop working on my proposal, and vote 'yes' again.
But don't ask me to do it in your place.

If nobody else does it, I will.

Please do!

I think Andrea's 0.3 proposal was extremely well balanced, served
everyone's needs whether they would admit it or not, and whose only
failing (subjectively termed) was the use of declare(). I think
declare() is fine and not nearly as ugly as some have slandered it to
be, but I'm willing to read the winds and modify it for v0.4.

Straw poll:

These two are my preference:

use strict; (pseudo-namespace)

declare(strict=true); (As a top-level declare only)

cheers,
Derick

10 years ago by Benoit Schildknecht

On Tue, 17 Feb 2015 00:58:18 +0100, Sara Golemon pollita@php.net wrote:

On Mon, Feb 16, 2015 at 2:50 PM, François Laupretre francois@php.net
wrote:

Once again, anyone can take over version 0.3, if it is so great. Why
don't you do it ?
I will play the game, stop working on my proposal, and vote 'yes' again.
But don't ask me to do it in your place.

If nobody else does it, I will.

I think Andrea's 0.3 proposal was extremely well balanced, served
everyone's needs whether they would admit it or not, and whose only
failing (subjectively termed) was the use of declare(). I think
declare() is fine and not nearly as ugly as some have slandered it to
be, but I'm willing to read the winds and modify it for v0.4.

Straw poll:

1) <?php strict;

2) <?php-strict

3) use strict; (pseudo-namespace)

3) <?php // strict (I don't actually like HHVM's style, but if you do...)

4) declare(strict=true); (As a top-level declare only)

5) declare(strict=true); (exactly as in v0.3 -- maybe you liked it)

your write-in vote here

I'm not going to scope in union types, nullables, or falsables. We
can leave that for a followup RFC, this one is contentious enough as
it is.

-Sara

I like 2): no possible confusion, and it's a clear tag.

But implementing 3) would be a good thing, since it is the Hack syntax, even
if I don't like using comments to enable something.

If PHP and Hack have two similar features, I think they should share
the same syntax, so that there are minimal BC breaks from one language to
the other, and people spend less time remembering which syntax is the right
one for PHP or for Hack.

10 years ago by Sanford Whiteman

I like 2) No possible confusion, and it's a clear tag.

I agree, but it feels like it gets away from PHP's underscore-heavy
syntax. The poll omitted <?php_strict -- that feels most PHP to me.

-- S.

Two semantics in the same language are bad enough. Three, IMHO, is just a no go, dealing with code having three different semantics would be completely impossible.

-- Lester Caine - G8HFL
