Good evening,
Parameter type hints for PHP’s scalar types are a long-requested feature for PHP. Today I am proposing an RFC which is a new attempt to add them to the language. It is my hope that we can finally get this done for PHP 7.
I’d like to thank Dmitry, who emailed me and gave me some feedback on the draft RFC and some improvements to the patch. He encouraged me to put this to internals sooner rather than later, as it’s a feature many people are hoping will be in PHP 7.
The new RFC can be found here: https://wiki.php.net/rfc/scalar_type_hints
As well as the RFC, there is a working Zend Engine III patch with tests, and an incomplete specification patch.
Please read the RFC (and specification patch, if you wish) and tell me your thoughts.
Thanks!
Andrea Faulds
http://ajf.me/
Parameter type hints for PHP’s scalar types are a long-requested feature for PHP. Today I am proposing an RFC which is a new attempt to add them to the language. It is my hope that we can finally get this done for PHP 7.
I’d like to thank Dmitry, who emailed me and gave me some feedback on the draft RFC and some improvements to the patch. He encouraged me to put this to internals sooner rather than later, as it’s a feature many people are hoping will be in PHP 7.
The new RFC can be found here: https://wiki.php.net/rfc/scalar_type_hints
At a first read through, this looks great, and much more in line with
what I'd like scalar type hints to look like. Nice job!
In terms of the open issues, here's what I think:
- Aliases: I think we should support the same set of names as we
currently support for type casts[0] (excluding non-scalar types,
obviously) — this actually expands the list a little more, since there
are three (!) variants for floating point values, but I think it's
important to be consistent in all the places in the language where we
use "type" names.
- Prohibiting use in class names: I think this makes sense, otherwise
we'll be violating the principle of least surprise when somebody does
call a class Int. (Yes, namespacing might help here, but I think I'd
rather restrict it altogether rather than trying to come up with rules
around when you can call a class "int" and how you'd refer to it.)
To bang another drum (and this shouldn't be responded to in this
thread, since we have another one going for this RFC): this sort of
change is why 5.7 is vital for migration — emitting deprecation
warnings for users calling classes by names that we'll be prohibiting
in PHP 7 is important, and will help our users migrate their code
bases.
I haven't looked at the patches yet, but assuming they're good, my
initial feeling is +1. Good work on taming the beast. :)
Adam
[0] http://php.net/manual/en/language.types.type-juggling.php#language.types.typecasting
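To make the alias question concrete, here is a minimal sketch of the cast spellings being discussed. It only shows that the short and long forms of each cast produce the same value; the variable names are illustrative. The (real) cast is left out on purpose because it no longer exists in PHP 8, and newer PHP releases may flag some of the long spellings as deprecated, so treat this as a picture of the situation at the time of the thread rather than a recommendation.

```php
<?php
$value = "42.5";

// Each pair applies the short and the long spelling of the same cast.
$pairs = array(
    'int/integer'  => array((int) $value, (integer) $value),
    'float/double' => array((float) $value, (double) $value),
    'bool/boolean' => array((bool) $value, (boolean) $value),
);

foreach ($pairs as $label => $pair) {
    // Prints whether both spellings gave an identical result.
    echo $label, ': ', var_export($pair[0] === $pair[1], true), "\n";
}
```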
Hey Adam,
At a first read through, this looks great, and much more in line with
what I'd like scalar type hints to look like. Nice job!
Glad to hear that!
In terms of the open issues, here's what I think:
- Aliases: I think we should support the same set of names as we
currently support for type casts[0] (excluding non-scalar types,
obviously) — this actually expands the list a little more, since there
are three (!) variants for floating point values, but I think it's
important to be consistent in all the places in the language where we
use "type" names.
I had thought about this. I don’t really want to support multiple names for each type. I’m alright with allowing short and long forms (int/integer, bool/boolean), because it’s fairly obvious to users that they mean the same thing, and we don’t use either form consistently. However, I don’t like completely different type names like ‘long’, ‘double’ and ‘real’. Luckily, in master, we no longer use ‘long’ or ‘double’ in error messages, and I don’t think we’ve ever used ‘real’ anywhere outside of the cast operator. So, I don’t think it’d cause problems to not allow these as type hints.
- Prohibiting use in class names: I think this makes sense, otherwise
we'll be violating the principle of least surprise when somebody does
call a class Int. (Yes, namespacing might help here, but I think I'd
rather restrict it altogether rather than trying to come up with rules
around when you can call a class "int" and how you'd refer to it.)
To bang another drum (and this shouldn't be responded to in this
thread, since we have another one going for this RFC): this sort of
change is why 5.7 is vital for migration — emitting deprecation
warnings for users calling classes by names that we'll be prohibiting
in PHP 7 is important, and will help our users migrate their code
bases.
I hadn’t thought of that, but yes, having a deprecation notice for this in 5.7 would be a good idea.
Thanks for your comments.
--
Andrea Faulds
http://ajf.me/
Hi!
I like it, it's consistent and to the point. If we must have scalar
typed parameters (I'm not sure but if we do), IMO that's the way to do it.
The issue is the class names though. E.g. see:
https://github.com/ralphschindler/zf2-db/blob/master/research/ColumnType/Integer.php
and:
https://github.com/canaydogan/ObjectValidator/blob/master/src/ObjectValidator/Validator/Int.php
These are uppercase "I", but class names are not case-sensitive... So
we'll need to figure it out. Unfortunately, looking at github,
disallowing "class Int" looks like pretty bad idea. We can make it
lowercase-only, but this would be a bit weird since it's not sensitive
in all other places (unless we change that which is another huge can of
worms).
BTW, right now this code:
function foo(integer $a) { var_dump($a); }
foo(1);
produces this message:
Catchable fatal error: Argument 1 passed to foo() must be an instance of
integer, integer given
which is pretty confusing I'd say :) We'd probably want to rephrase that
message (yes, I know, patching all those tests yet again...)
Stas Malyshev
smalyshev@gmail.com
Hey Stas,
I like it, it's consistent and to the point. If we must have scalar
typed parameters (I'm not sure but if we do), IMO that's the way to do it.
Glad you like it.
The issue is the class names though. E.g. see:
https://github.com/ralphschindler/zf2-db/blob/master/research/ColumnType/Integer.php
and:
https://github.com/canaydogan/ObjectValidator/blob/master/src/ObjectValidator/Validator/Int.php
These are uppercase "I", but class names are not case-sensitive... So
we'll need to figure it out. Unfortunately, looking at github,
disallowing "class Int" looks like pretty bad idea. We can make it
lowercase-only, but this would be a bit weird since it's not sensitive
in all other places (unless we change that which is another huge can of
worms).
Yeah, it’s a problem. I think some breakage here is inevitable, unfortunately. Some of the classes with these names are stand-ins for scalar type hints, so that code can “just” migrate to using actual hints. But this doesn’t apply to all of them.
We could choose to simply not prohibit them as class names, but that creates a weird inconsistency where you can make ‘class Integer’ yet ‘function foo(Integer $a)’ hints against the integer type, not your class. Type hints are very widely used, so I doubt this would help anyone, and we’d still be breaking existing code type hinting against such classes.
There’s not much we can do really. I suppose there is one positive outcome, in that hopefully when broken code is updated to work, it might have more descriptive class names. :)
BTW, right now this code:
function foo(integer $a) { var_dump($a); }
foo(1);
produces this message:
Catchable fatal error: Argument 1 passed to foo() must be an instance of
integer, integer given
which is pretty confusing I'd say :) We'd probably want to rephrase that
message (yes, I know, patching all those tests yet again…)
We could rephrase it, but I don’t think it’s that bad. Once scalar hints are in, you won’t get that error any more. I think it could be worse as errors go, at least it distinguishes semi-clearly between objects (“instance of integer”) and other types (“integer”).
Thanks!
Andrea Faulds
http://ajf.me/
Hi!
Yeah, it’s a problem. I think some breakage here is inevitable,
unfortunately. Some of the classes with these names are stand-ins for
scalar type hints, so that code can “just” migrate to using actual
hints. But this doesn’t apply to all of them.
Breaking ZF2 and all software built on it is not "some breakage", it's a
serious issue which would produce a big barrier for PHP 7 migration. And
looks like there are more frameworks that do the same. This would be a
barrier to PHP 7 adoption, and note that is even for people that
couldn't care less for scalar typing. We'd find ourselves in python 3
situation - where people would be glad to upgrade but they use library X
and it doesn't work and they have no idea how to fix it and they keep
all their development on the old version and the new one never catches
on. It'd be a shame if we spend all this effort on PHP 7 and get no
adoption since people can't run their existing code on it.
We could choose to simply not prohibit them as class names, but that
creates a weird inconsistency where you can make ‘class Integer’ yet
‘function foo(Integer $a)’ hints against the integer type, not your
class. Type hints are very widely used, so I doubt this would help
anyone, and we’d still be breaking existing code type hinting against
such classes.
I'd rather make the hints case sensitive. In fact, of two BC breakages
making classes case sensitive may be the lesser one (I'm not a big fan
of either but at least the modern frameworks would probably all work and
if some code does not it's possible to auto-fix it).
There’s not much we can do really. I suppose there is one positive
outcome, in that hopefully when broken code is updated to work, it
might have more descriptive class names. :)
The issue here is that people that now use ZF2 or other framework like
that won't rewrite it. They would just stay on PHP 5.
Stas Malyshev
smalyshev@gmail.com
Hi Stas,
Hi!
Yeah, it’s a problem. I think some breakage here is inevitable,
unfortunately. Some of the classes with these names are stand-ins for
scalar type hints, so that code can “just” migrate to using actual
hints. But this doesn’t apply to all of them.
Breaking ZF2 and all software built on it is not "some breakage", it's a
serious issue which would produce a big barrier for PHP 7 migration. And
looks like there are more frameworks that do the same. This would be a
barrier to PHP 7 adoption, and note that is even for people that
couldn't care less for scalar typing. We'd find ourselves in python 3
situation - where people would be glad to upgrade but they use library X
and it doesn't work and they have no idea how to fix it and they keep
all their development on the old version and the new one never catches
on. It'd be a shame if we spend all this effort on PHP 7 and get no
adoption since people can't run their existing code on it.
I wouldn’t say it’s impossible to work around. You could rename the class to something which doesn’t conflict, but add a conditional class_alias for PHP 5. Codebases needing to work on both PHP 5 and PHP 7 can switch to the new name, codebases only needing to work on PHP 5 can stick with the old name.
Does that sound workable?
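A minimal sketch of that workaround, assuming a hypothetical library class that used to be called Int and is renamed to IntValidator. Both class names and the isValid() method are made up for illustration; only class_alias() and PHP_MAJOR_VERSION are real PHP features.

```php
<?php
// Hypothetical library class. It used to be called "Int"; the new name
// does not collide with a scalar type name.
class IntValidator
{
    public function isValid($value)
    {
        return is_int($value);
    }
}

// Only on PHP 5: keep the old name working for existing callers.
// The guard keeps this call from ever running where "Int" is not allowed.
if (PHP_MAJOR_VERSION < 7) {
    class_alias('IntValidator', 'Int');
}

$validator = new IntValidator();
var_dump($validator->isValid(42));
```

Code that must run on both versions would refer to IntValidator, while PHP 5-only code could keep using the old name through the alias.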
We could choose to simply not prohibit them as class names, but that
creates a weird inconsistency where you can make ‘class Integer’ yet
‘function foo(Integer $a)’ hints against the integer type, not your
class. Type hints are very widely used, so I doubt this would help
anyone, and we’d still be breaking existing code type hinting against
such classes.
I'd rather make the hints case sensitive. In fact, of two BC breakages
making classes case sensitive may be the lesser one (I'm not a big fan
of either but at least the modern frameworks would probably all work and
if some code does not it's possible to auto-fix it).
If they were case-sensitive, this would be inconsistent with other type names like array and callable.
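For reference, a small sketch of the existing behaviour this refers to: the array and callable hints are matched without regard to case. The function names are illustrative only.

```php
<?php
// The built-in type names are matched case-insensitively, so these
// unusual spellings behave like "array" and "callable".
function countItems(Array $items)
{
    return count($items);
}

function invoke(CALLABLE $callback)
{
    return $callback();
}

var_dump(countItems(array(1, 2, 3)));
var_dump(invoke(function () { return 'called'; }));
```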
Thanks!
Andrea Faulds
http://ajf.me/
Hi!
If they were case-sensitive, this would be inconsistent with other type names like array and callable.
Of course, I mean making all these hints case-sensitive. I have never
seen code using uppercase Callable or Array, though I imagine it might
happen. Github's search is case-insensitive so I don't really know how
to check for it - though since I see absolutely no reason for doing that
(no documentation, examples, guides, best practices, etc. ever suggest
you should write Callable) I would guesstimate the impact would be small.
Stas Malyshev
smalyshev@gmail.com
ZF2 completely broke compatibility with ZF1 users, so I think this is a bad example.
Regards
Thomas
Stanislav Malyshev wrote on 02.01.2015 01:15:
Hi!
Yeah, it’s a problem. I think some breakage here is inevitable,
unfortunately. Some of the classes with these names are stand-ins for
scalar type hints, so that code can “just” migrate to using actual
hints. But this doesn’t apply to all of them.
Breaking ZF2 and all software built on it is not "some breakage", it's a
serious issue which would produce a big barrier for PHP 7 migration. And
looks like there are more frameworks that do the same. This would be a
barrier to PHP 7 adoption, and note that is even for people that
couldn't care less for scalar typing. We'd find ourselves in python 3
situation - where people would be glad to upgrade but they use library X
and it doesn't work and they have no idea how to fix it and they keep
all their development on the old version and the new one never catches
on. It'd be a shame if we spend all this effort on PHP 7 and get no
adoption since people can't run their existing code on it.
We could choose to simply not prohibit them as class names, but that
creates a weird inconsistency where you can make ‘class Integer’ yet
‘function foo(Integer $a)’ hints against the integer type, not your
class. Type hints are very widely used, so I doubt this would help
anyone, and we’d still be breaking existing code type hinting against
such classes.
I'd rather make the hints case sensitive. In fact, of two BC breakages
making classes case sensitive may be the lesser one (I'm not a big fan
of either but at least the modern frameworks would probably all work and
if some code does not it's possible to auto-fix it).
There’s not much we can do really. I suppose there is one positive
outcome, in that hopefully when broken code is updated to work, it
might have more descriptive class names. :)
The issue here is that people that now use ZF2 or other framework like
that won't rewrite it. They would just stay on PHP 5.
Stas Malyshev
smalyshev@gmail.com
Hi!
ZF2 completely broke compatibility with ZF1 users, so I think this is
a bad example.
We're talking about different things here. PHP is a universal platform
and PHP 7 would be offered as upgrade to all PHP users - running ZF1,
ZF2, Symfony, Drupal, anything. If there would be a sizeable chance
that their existing code would not run on PHP 7, people would not
upgrade. Our upgrade record is not stellar as it is, even with
extraordinary effort we put in keeping BC 5.4->5.6. If we break major
libraries in 7, I am afraid we'd have adoption problem.
ZF2 wasn't really an upgrade from ZF1 - nobody expected you to just jump
from ZF1 to ZF2 on the same code. So it's not the point here, the point
is that ZF2 is an example of major framework that uses the feature which
this RFC proposes to break. There are more.
Stas Malyshev
smalyshev@gmail.com
Stanislav Malyshev wrote on 02.01.2015 01:57:
Hi!
ZF2 completely broke compatibility with ZF1 users, so I think this is
a bad example.
We're talking about different things here. PHP is a universal platform
and PHP 7 would be offered as upgrade to all PHP users - running ZF1,
ZF2, Symfony, Drupal, anything. If there would be a sizeable chance
that their existing code would not run on PHP 7, people would not
upgrade. Our upgrade record is not stellar as it is, even with
extraordinary effort we put in keeping BC 5.4->5.6. If we break major
libraries in 7, I am afraid we'd have adoption problem.
ZF2 wasn't really an upgrade from ZF1 - nobody expected you to just jump
from ZF1 to ZF2 on the same code. So it's not the point here, the point
is that ZF2 is an example of major framework that uses the feature which
this RFC proposes to break. There are more.
Stas Malyshev
smalyshev@gmail.com
I don't see ZF2 as a big problem:
/tmp/zf/ZendFramework-2.3.3# grep -Erin "\(int|integer|bool|boolean|float|string)[^a-z0-9]|class (int|integer|bool|boolean|float|string)[^a-z0-9]" . | wc -l
27
Making a few changes here should not be a problem.
Regards
Thomas
The issue is the class names though. E.g. see:
https://github.com/ralphschindler/zf2-db/blob/master/research/ColumnType/Integer.php
That's in a namespace, so it's not actually Integer, but
Zend\Db\Metadata\Type\Integer
and:
https://github.com/canaydogan/ObjectValidator/blob/master/src/ObjectValidator/Validator/Int.php
That's in a namespace, so it's not actually Int, but
ObjectValidator\Validator\Int
These are uppercase "I", but class names are not case-sensitive... So
we'll need to figure it out. Unfortunately, looking at github,
disallowing "class Int" looks like pretty bad idea.
Sorry, but for years we have this in the manual
(http://php.net/manual/en/userlandnaming.rules.php):
"PHP owns the top-level namespace but tries to find decent descriptive
names and avoid any obvious clashes."
With namespaces (as in your above examples), this is already moot.
I think it's perfectly acceptable that PHP makes a built-in type a
reserved word. I would certainly not change it to PHPInt to "avoid any
obvious clashes".
cheers,
Derick
2015-01-04 18:31 GMT+01:00 Derick Rethans derick@php.net:
The issue is the class names though. E.g. see:
https://github.com/ralphschindler/zf2-db/blob/master/research/ColumnType/Integer.php
That's in a namespace, so it's not actually Integer, but
Zend\Db\Metadata\Type\Integer
It might add some value to forbid importing of unaliased type named
classes via use, so that no ambiguous type annotation exists.
Example:
use My\Lib\string as UserString; // fine, no error
use My\Lib\int; // Compile error: Forbidden to import scalar type. Use 'as' to create an alias
function test (int $a, UserString $b); // what is int?
Most libraries should be able to resolve such issues in a minor
release to stay compatible with PHP 7. If a user doesn't use PHP 7 he
won't be affected as the old import still works and the name of the
class (if namespaced) hasn't changed, if he switches to PHP7 the
updates can probably be automated.
Namespaced classes can still be named whatever they want as the
following (typehint an alias of the own class, using FQN) is already
valid and the signature of the method would not change (no major
release of a library required):
namespace abc;
use abc\int as MyInt;
class int { function a(MyInt $a, \abc\int $b) {} }
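The claim that an alias and a fully qualified name give the same signature can be checked with reflection. This is only a sketch: it uses a hypothetical class Abc\Number instead of a class named int, so that it does not depend on whether scalar type names end up reserved, and it relies on ReflectionNamedType::getName(), which needs PHP 7.1 or later.

```php
<?php
namespace Abc;

// Alias for the class defined below. "Number" is a hypothetical name.
use Abc\Number as MyNumber;

class Number
{
    // One parameter is declared through the alias, the other through the
    // fully qualified name; both refer to the same class.
    public function same(MyNumber $a, \Abc\Number $b)
    {
        return $a === $b;
    }
}

$method = new \ReflectionMethod('Abc\Number', 'same');
foreach ($method->getParameters() as $parameter) {
    // Prints the parameter name and the class name its type resolves to.
    echo $parameter->getName(), ' => ', $parameter->getType()->getName(), "\n";
}
```

Both parameters report the same class name, which is why a library could switch between the two spellings without changing its public signature.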
"PHP owns the top-level namespace but tries to find decent descriptive
names and avoid any obvious clashes."
I don't think too many people wrap types in a class in the global
namespace. Those people who care about the possible benefits are
likely to use namespaces, pseudo/pear namespaces or more specialized
classes anyway.
Hey Sebastian,
It might add some value to forbid importing of unaliased type named
classes via use, so that no ambiguous type annotation exists.
Example:
use My\Lib\string as UserString; // fine, no error
use My\Lib\int; // Compile error: Forbidden to import scalar type. Use 'as' to create an alias
That’s something that was actually on my todo list. No point in forbidding class names if you can get around it with use.
Once I do this in the patch I’ll update the RFC to say as much.
Most libraries should be able to resolve such issues in a minor
release to stay compatible with PHP 7. If a user doesn't use PHP 7 he
won't be affected as the old import still works and the name of the
class (if namespaced) hasn't changed, if he switches to PHP7 the
updates can probably be automated.
Yes, this is possible. You could just rename the class to something that’s OK in both PHP 5 and 7 (IntClass, e.g.), then add a conditional class_alias() for PHP 5.
Actually, I’d need to update class_alias() as well. Hmm. Maybe I should make some sort of internal is_class_name_permitted() function.
Namespaced classes can still be named whatever they want as the
following(typehint an alias of the own class, using FQN) is already
valid and the signature of the method would not change(no major
release of a library required):
namespace abc;
use abc\int as MyInt;
class int { function a(MyInt $a, \abc\int $b) {} }
I don’t like the idea of allowing these within namespaces, because relative uses would still be broken. Better to just force people to fix their class names.
Thanks.
Andrea Faulds
http://ajf.me/
Hi!
That's in a namespace, so it's not actually Integer, but
Zend\Db\Metadata\Type\Integer
Right, but it doesn't matter - in the same namespace (and in ones with
suitable imports), foo(Integer $bar) still means the class, not the
primitive type, and that would be broken by this RFC. Also, if integer
becomes reserved word, you won't be able to write "class Integer"
anymore, regardless of the namespace.
With namespaces (as in your above examples), this is already moot.
I'm sorry, I don't see how this is moot if we still have the breakage as
described above.
I think it's perfectly acceptable that PHP makes a built-in type a
reserved word. I would certainly not change it to PHPInt to "avoid any
obvious clashes".
As I already pointed out, if we make it reserved word we'll have BC
problem. We already had this experience when goto became reserved word -
it resulted in a lot of breakage. This would be bigger, since we adding
more reserved words and we know they are used in a lot of class names
(and, possibly, method names too).
Stas Malyshev
smalyshev@gmail.com
Hi Stas,
I think it's perfectly acceptable that PHP makes a built-in type a
reserved word. I would certainly not change it to PHPInt to "avoid any
obvious clashes".
As I already pointed out, if we make it reserved word we'll have BC
problem. We already had this experience when goto became reserved word -
it resulted in a lot of breakage. This would be bigger, since we adding
more reserved words and we know they are used in a lot of class names
(and, possibly, method names too).
It is a BC problem, yes. However, it can be worked around in userland in a way where unchanged code will continue to work on PHP 5, and changed code will work on both PHP 5 and PHP 7. I think the BC break is worth it, though unfortunate. I also wonder why we hadn’t reserved these names before, it seems rather silly that you were ever able to make a class called “integer” or “array”.
By the way, we might want to pre-reserve resource as well. This RFC intentionally does not add a type hint for it, but we may wish to do so in future. It might also help with the migration to classes, too, as this would stop userland developers creating classes called “Resource” in the global namespace that would cause a future conflict.
Thoughts?
--
Andrea Faulds
http://ajf.me/
Hi Stas,
I think it's perfectly acceptable that PHP makes a built-in type a
reserved word. I would certainly not change it to PHPInt to "avoid any
obvious clashes".
As I already pointed out, if we make it reserved word we'll have BC
problem. We already had this experience when goto became reserved word -
it resulted in a lot of breakage. This would be bigger, since we adding
more reserved words and we know they are used in a lot of class names
(and, possibly, method names too).
It is a BC problem, yes. However, it can be worked around in userland in a
way where unchanged code will continue to work on PHP 5, and changed code
will work on both PHP 5 and PHP 7. I think the BC break is worth it, though
unfortunate. I also wonder why we hadn’t reserved these names before, it
seems rather silly that you were ever able to make a class called “integer”
or “array”.
Note that this kind of BC break will gladly be fixed in major libraries out
there: we can rename those classes and announce the breakages ourselves
without too many problems.
The best way to eagerly detect this sort of problem would be to have a
php-nightly version on travis-ci, and then we could handle it ourselves and
without too much pain (already trying to get there, and mbeccati has a CI
system running with php7 and some major libs).
The only trouble that could eventually come up is with legacy software that
isn't tested at all.
Marco Pivetta
Am 31.12.2014 um 21:27 schrieb Andrea Faulds:
Parameter type hints for PHP’s scalar types
Please use the term "type declaration for arguments" (or "type
declaration for parameters") instead of "type hints". If it's used
then it's not a hint.
Hello Sebastian,
Am 31.12.2014 um 21:27 schrieb Andrea Faulds:
Parameter type hints for PHP’s scalar types
Please use the term "type declaration for arguments" (or "type
declaration for parameters") instead of "type hints". If it's used
then it's not a hint.
Thanks for bringing this up.
Interestingly, from the RFC I had another issue with it: what it
technically does is neither "hinting" nor "declaration"; rather, it tries
conversion and seems to have multiple ways it may continue:
- continue, everything as expected
- catchable error
- notices thrown
To Andrea:
As much as I'd like to see such a language feature like this, I think:
- the naming of the RFC thus the intent is confusing
What it really does is it tries its best to convert, e.g. the RFC reads
as it tries to work like this:
function foo(int $bar) {
$bar = (int)$bar;
}
From your description I understand that, technically, it doesn't do that
exactly; I was merely trying to make a point how it "looks it works";
see my next point.
- Casting and Validation Rules
"While this RFC merely follows PHP's existing rules for scalar
parameters, used by extension functions, these rules may not be familiar
to all readers of this RFC."
Very good job in pointing this out! Which brings me to this question:
are these different rules than the general casting rules (as I exampled
above) ? If yes, wouldn't this increase the burden on PHP developers
even more to learn new rules (and, those rules are already too many to
sanely remember).
- "Non-numeric strings not accepted. Numeric strings with trailing
characters are accepted, but produce a notice. "
That behavior, IMHO, is very bad for this kind of feature.
What's the point of continuing the code when developer asked for "int"
and code logic continues with "something not quite an int"?
Doesn't that defeat the whole purpose of the use of this RFC?
I know that this can be fixed using an error handler, throwing an
exception to abort code execution but that should really be an error
anyway, IMHO, on par with a gross mismatch of types.
I'm not in favor of these soft rules.
thanks,
- Markus
Hi Markus,
- the naming of the RFC thus the intent is confusing
What it really does is it tries its best to convert, e.g. the RFC reads
as it tries to work like this:
function foo(int $bar) {
$bar = (int)$bar;
}
From your description I understand that, technically, it doesn't do that
exactly; I was merely trying to make a point how it "looks it works";
see my next point.
I think it’d be weird to have different syntaxes for scalars and non-scalars. We already use the syntax the RFC proposes in the PHP manual, and I don’t think anyone’s confused by it.
I realise that the behaviour of scalar type hints is slightly different from the hints for complex types, but I think this would just look ugly:
function foo((int) $bar, Foo $baz, (float) $qux);
- Casting and Validation Rules
"While this RFC merely follows PHP's existing rules for scalar
parameters, used by extension functions, these rules may not be familiar
to all readers of this RFC."
Very good job in pointing this out! Which brings me to this question:
are these different rules than the general casting rules (as I exampled
above) ? If yes, wouldn't this increase the burden on PHP developers
even more to learn new rules (and, those rules are already too many to
sanely remember).
Yes and no. With the exception of hexadecimal numbers in strings, explicit casts and internal functions follow the same rules for conversion. However, the validation rules don’t match. Explicit casts never fail, while internal functions will reject arguments that are non-scalar or don’t fit certain rules.
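The difference between "casts never fail" and "declared parameters reject some values" can be sketched as follows. Note the assumptions: this uses the behaviour of scalar parameter types as they exist in PHP 7 and later in the default (non-strict) mode, where a rejected argument raises a TypeError. That is not necessarily identical to what this RFC draft proposes (which talks about a catchable error), and takesInt() is a made-up function name.

```php
<?php
// Explicit casts never fail: they always produce some integer.
$castNonNumeric = (int) "abc";   // 0
$castNumeric    = (int) "12";    // 12

// A declared int parameter converts the values it accepts...
function takesInt(int $value)
{
    return $value;
}

$accepted = takesInt("12");

// ...but rejects a non-numeric string instead of silently producing 0.
$rejected = false;
try {
    takesInt("abc");
} catch (TypeError $e) {
    $rejected = true;
}

var_dump($castNonNumeric, $castNumeric, $accepted, $rejected);
```

Strings with trailing characters such as "12abc" are deliberately left out, because the diagnostic they produce differs between PHP versions.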
- "Non-numeric strings not accepted. Numeric strings with trailing
characters are accepted, but produce a notice. "
That behavior, IMHO, is very bad for this kind of feature.
What's the point of continuing the code when developer asked for "int"
and code logic continues with "something not quite an int"?
Doesn't that defeat the whole purpose of the use of this RFC?
I know that this can be fixed using an error handler, throwing an
exception to abort code execution but that should really be an error
anyway, IMHO, on par with a gross mismatch of types.
I can’t say I love that behaviour either. It is, however, our existing behaviour. I’d rather we be consistent with internal functions.
Thanks.
Andrea Faulds
http://ajf.me/
Hello Andrea,
I think it’d be weird to have different syntaxes for scalars and non-scalars. We already use the syntax the RFC proposes in the PHP manual, and I don’t think anyone’s confused by it.
I didn't mean to say there's something wrong with the syntax, sorry if
my text was confusing! I was rather trying to point out that it does not
hint at anything but proactively tries to convert types; see below for more:
- Casting and Validation Rules
"While this RFC merely follows PHP's existing rules for scalar
parameters, used by extension functions, these rules may not be familiar
to all readers of this RFC."
Yes and no. With the exception of hexadecimal numbers in strings, explicit casts and internal functions follow the same rules for conversion. However, the validation rules don’t match. Explicit casts never fail, while internal functions will reject arguments that are non-scalar or don’t fit certain rules.
[...]
- "Non-numeric strings not accepted. Numeric strings with trailing
characters are accepted, but produce a notice. "
[...]
What's the point of continuing the code when developer asked for "int"
and code logic continues with "something not quite an int"?
Doesn't that defeat the whole purpose of the use of this RFC?
[...]
I can’t say I love that behaviour either. It is, however, our existing behaviour. I’d rather we be consistent with internal functions.
And as you also pointed out: those conversion rules are different from
what a plain "(int)$whatever" would do.
I completely understand the technical rationale here: internal functions have
an existing argument/parsing system which is now exposed to the end
developer ("php user") here.
I argue this system is fit and works (and, err, talk BC; obviously!) for
the whole internal PHP function/method system but ..
.. does not reflect the needs and requirements of a php user/developer
because exactly of these implicit conversion rules with three different
output states (everything ok, hard type mismatch, notice on partial
conversion).
I'd also argue that scalar types in function signatures behaving
differently than object type hints is potentially a bad thing for future
of PHP and I already made my point which is the most important to me:
What's the point of continuing the code when developer asked for "int"
and code logic continues with "something not quite an int"?
Now, going one step back here (talking about me), I'm speaking up because
my needs as a developer are different (mostly speaking about backend
code, interfaces, libraries, frameworks) but OTOH I'm not a big known
open source framework developer either ;)
I would honestly be interested what the big framework/library players
actually want/need; do they prefer this implicit scalar type conversion
system or rather have a rigid system like the current object types but
for scalars too? I think decision on this RFC should include also
"their" saying too.
It's complex because we can't force anyone to participate but I think
above all these are the most important audience here because they know
what they want and they know what their users want. I say this because
usage of object types in PHP is almost non-existent (or, there are just
too few cases) compared to the architecture of some of the
framework/library systems out there.
Hmm.
thanks,
- Markus
Hi Markus,
I think it’d be weird to have different syntaxes for scalars and non-scalars. We already use the syntax the RFC proposes in the PHP manual, and I don’t think anyone’s confused by it.
I didn't mean to say there's something wrong with the syntax, sorry if
my text was confusing! I was rather trying to point out that it does not
hint at anything but proactively tries to convert types; see below for more:
Ah, I see. Well, at least I was able to cover that syntax suggestion before someone else brings it up. I see your point.
And as you also pointed out: those conversion rules are different from
what a plain "(int)$whatever" would do.
The conversion rules are the same, it’s just that scalar hints reject certain values. The values they accept are converted the same. I don’t think this is unreasonable: implicit and explicit casts shouldn’t behave identically. Implicit casts need to be much stricter.
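To make the distinction concrete, here is a small sketch of how explicit casts treat the strings mentioned in this thread. The cast results are ordinary PHP behaviour. The helper `looks_acceptable()` is a hypothetical stand-in for a stricter validation step, built on `is_numeric()`; it is an assumption for illustration and not the RFC's implementation (for one thing, it rejects trailing characters, which the RFC would accept with a notice).

```php
<?php
// Explicit casts never fail: they always produce some integer.
var_dump((int) "42");      // int(42)
var_dump((int) "42 fish"); // int(42), trailing characters silently dropped
var_dump((int) "foobar");  // int(0)
var_dump((int) "");        // int(0)

// Hypothetical stand-in for a stricter, validating conversion.
// Assumption: is_numeric() approximates "numeric string"; the real
// rules live in the engine and differ in detail.
function looks_acceptable($value)
{
    return is_int($value) || is_float($value) || is_numeric($value);
}

var_dump(looks_acceptable("42"));     // bool(true)
var_dump(looks_acceptable("foobar")); // bool(false)
var_dump(looks_acceptable(""));       // bool(false)
```

The point of the sketch is only that "which values are accepted" and "how accepted values are converted" are separate questions.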
Now, taking one step back here (talking about me), I'm speaking up because
my needs as a developer are different (mostly speaking about backend
code, interfaces, libraries, frameworks), but OTOH I'm not a well-known
open source framework developer either ;)
I would honestly be interested in what the big framework/library players
actually want/need; do they prefer this implicit scalar type conversion
system, or would they rather have a rigid system like the current object
types but for scalars too? I think the decision on this RFC should also
include "their" say.
It's complex because we can't force anyone to participate but I think
above all these are the most important audience here because they know
what they want and they know what their users want. I say this because
usage of object types in PHP is almost non-existent (or, there are just
too few cases) compared to the architecture of some of the
framework/library systems out there.
That’s a fair point. I’m not sure how they feel about it.
Their views aren’t necessarily the most important, though. Frameworks can do whatever they like, but what ultimately matters is what’s best for the end users, who don’t deal with the framework internals.
Thanks.
Andrea Faulds
http://ajf.me/
The "end users" of php-src are "people who write PHP code". Those are
the end users that we should be concerned with. "People who visit web
sites" are their end users. Those people don't care in the slightest
what happens on this list; they care that the people writing PHP code
can do their job in a minimum amount of time and with a minimum amount
of bugs.
So asking developers of the major PHP frameworks and applications what
would help them do their job in a minimum amount of time with a minimum
amount of bugs is absolutely a worthwhile endeavor to figure out what
would be "best".
"User research" in this case, means talking to the lead developers of
Zend Framework, Symfony, Drupal, Wordpress, phpBB, and so on. I'd be
happy to make an introduction for you on the FIG mailing list, which is the
best collection of such people I know of.
--Larry Garfield
I believe that you are making a mistake by assuming that all our 'users' are the
same and have the same goals. There are:
a) those producing a web site, who want to do so quickly, are not really concerned
about occasional type errors (maybe they should be, but that is another story) and
for whom current weak typing (inc. type juggling) is just what they want. The
application probably had little real design but runs to their satisfaction, and
probably has few comments or other documentation.
b) those who are producing a package/framework which takes sloppy input (from
forms, etc.), validates it, and thereafter variables should contain known types.
Also here are those who like to remove all errors from even small applications.
They might not care if a value is int 42 or string "42", but want to be told (i.e.
an error) if it is "42 fish" or "forty two". They can then fix their validation.
They want to do the validation in one or a few places, not have to put checks all over.
These users are willing to put in the effort to get it right; they have designed
it and will document it, etc.
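What camp (b) describes can be approximated today with a single validation helper at the input boundary. The following is only a sketch: `to_int_or_fail()` is a hypothetical name, and using `filter_var()` with `FILTER_VALIDATE_INT` is one possible choice, not something proposed in this thread.

```php
<?php
// Hypothetical helper: accept int 42 or string "42", reject anything
// that is not cleanly an integer.
function to_int_or_fail($value)
{
    $result = filter_var($value, FILTER_VALIDATE_INT);
    if ($result === false) {
        throw new InvalidArgumentException(
            'Not an integer: ' . var_export($value, true)
        );
    }
    return $result;
}

var_dump(to_int_or_fail(42));   // int(42)
var_dump(to_int_or_fail("42")); // int(42)

foreach (array("42 fish", "forty two") as $bad) {
    try {
        to_int_or_fail($bad);
    } catch (InvalidArgumentException $e) {
        echo $e->getMessage(), "\n";
    }
}
```

Once the value has passed through such a helper, the rest of the code can rely on it being an int, which is the "validate in one place" property described above.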
There are many in between (a) & (b) in various combinations.
So, the needs/wants of (a) and (b) are very different.
IMHO part of the reason for the high contention is that people comment from the
perspective of either (a) or (b) and thus will not agree with someone from the
other camp.
Part of the solution is making type checking optional - as is the proposal.
But do try to remember that we do have different types of user.
Me ? I am in the (b) camp but do occasionally stray to (a).
There is a 3rd camp: (c)
These are the language implementors, in particular 3rd parties eg HipHop. These
do care if a variable is int 42 or string "42" because there are optimisations
to be made if they know that a value is really an int and not a string cleanly
convertible to an int.
--
Alain Williams
Linux/GNU Consultant - Mail systems, Web sites, Networking, Programmer, IT Lecturer.
+44 (0) 787 668 0256 http://www.phcomp.co.uk/
Parliament Hill Computers Ltd. Registration Information: http://www.phcomp.co.uk/contact.php
#include <std_disclaimer.h>
Hi Alain,
There is a 3rd camp: (c)
These are the language implementors, in particular 3rd parties eg HipHop. These
do care if a variable is int 42 or string "42" because there are optimisations
to be made if they know that a value is really an int and not a string cleanly
convertible to an int.
Er, optimisations are not at all prevented by having implicit conversions. They still allow you to reason about types within a function, because types are converted.
Thanks.
Andrea Faulds
http://ajf.me/
Hey Sebastian,
On 31.12.2014 at 21:27, Andrea Faulds wrote:
Parameter type hints for PHP’s scalar types
Please use the term "type declaration for arguments" (or "type
declaration for parameters") instead of "type hints". If it's used
then it's not a hint.
You’re not the only one who feels this way; Levi Morrison is also not too fond of our existing terminology. However, it is the name we’ve always used for these. It’s not the best name, but it is the name we’ve given them. This could be changed, but I don’t think that’s the job of this RFC. If someone else wants to push to change the name, go ahead.
Thanks!
Andrea Faulds
http://ajf.me/
Hi!
Please use the term "type declaration for arguments" (or "type
declaration for parameters") instead of "type hints". If it's used
then it's not a hint.
Thank you! I was just going to post this. We've made this mistake once,
but we don't have to perpetuate it. It's not "hinting", it's type
declaration, it's coercive typing, it is many things but "hinting" is a
misleading way to describe it.
--
Stas Malyshev
smalyshev@gmail.com
-----Original Message-----
From: Andrea Faulds [mailto:ajf@ajf.me]
Sent: Wednesday, December 31, 2014 10:28 PM
To: PHP Internals
Subject: [PHP-DEV] [RFC] Scalar Type Hints
Good evening,
Parameter type hints for PHP’s scalar types are a long-requested feature
for PHP. Today I am proposing an RFC which is a new attempt to add them to
the language. It is my hope that we can finally get this done for PHP 7.
Andrea,
I like this draft too, and that's a first after countless proposals over the
last decade - so kudos! :)
My main feedback here are the discrepancies between this RFC's casting rules
and PHP's current built-in casting rules. Ideally, I'd like those to be
completely identical and not almost-identical.
Since we're talking about v7.0, we do have the option of making changes to
PHP's fundamental casting rules where appropriate (e.g. converting an array
to a string).
But that said, I think the way strings->numbers are handled - where they
accept only numeric strings - is a problem, as it would mean you can't use
casting to an int/float as an ultra-simple way to sanitize untrusted input.
I would change the † section from:
†Non-numeric strings not accepted. Numeric strings with trailing characters
are accepted, but produce a notice.
to
† Numeric strings with trailing characters and non-numeric strings are
accepted, but produce a notice.
- and apply it to both this RFC and the infrastructure convert_to_*(), so
that it applies across the board in PHP.
Zeev
Hello,
I had been expecting an RFC like this in PHP for a while. I'm happy somebody made
one, thanks.
But something struck me: even if you can't pass an object, you can pass
any scalar type, and it will be cast.
I'm not sure this behavior is very useful.
If I ask for a string, why should the caller be able to pass an int
without getting an error?
Consider the following PHP 5.6 code:
    function foo($a) {
        if (!is_string($a)) {
            throw new \Exception('You need to give a string');
        }
    }
Using your type hinting does not fix the problem of string checking, because
the value will just be cast if it's an integer. Of course, if the caller passes
an object, it's fine.
But why not raise an error in all cases?
I hope I'm clear enough.
Thanks for reading, and have a happy new year :-) .
Hi Maxime,
I had been expecting an RFC like this in PHP for a while. I'm happy somebody made one, thanks.
Glad to hear that.
But something struck me: even if you can't pass an object, you can pass any scalar type, and it will be cast.
I'm not sure this behavior is very useful.
If I ask for a string, why should the caller be able to pass an int without getting an error?
Consider the following PHP 5.6 code:
    function foo($a) {
        if (!is_string($a)) {
            throw new \Exception('You need to give a string');
        }
    }
Using your type hinting does not fix the problem of string checking, because the value will just be cast if it's an integer. Of course, if the caller passes an object, it's fine.
But why not raise an error in all cases?
For various reasons, PHP has always been a weakly-typed language, so we allow conversions between scalar types when they make sense. An integer isn’t a string, sure, but it can be simply converted to one and the result makes sense, so I think that’s why we allow it. It would still error in some cases, like an object without __toString for a string parameter, or an array for an integer parameter.
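As an illustration of the behaviour described here, the sketch below uses scalar parameter types as they exist in PHP 7.0 and later in the default (non-strict) mode. That is an assumption about the runtime: the draft under discussion may report failures differently (for example as an error rather than a `TypeError` exception). The function names `shout()` and `double()` are made up for the example.

```php
<?php
// Assumes PHP 7.0+ in its default mode, i.e. without declare(strict_types=1).
function shout(string $s)
{
    return strtoupper($s) . '!';
}

function double(int $n)
{
    return $n * 2;
}

var_dump(shout(42));    // string(3) "42!"  (int converted to string)
var_dump(double("21")); // int(42)          (numeric string converted to int)

try {
    double(array(1, 2, 3)); // an array has no sensible integer value
} catch (TypeError $e) {
    echo "Rejected: ", get_class($e), "\n";
}
```

A userland `is_string()` check like the one in the quoted `foo()` would still be needed by anyone who wants `shout(42)` to fail.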
Thanks.
Andrea Faulds
http://ajf.me/
Hey Zeev,
I like this draft too, and that's a first after countless proposals over the
last decade - so kudos! :)
Glad to hear that.
My main feedback here are the discrepancies between this RFC's casting rules
and PHP's current built-in casting rules. Ideally, I'd like those to be
completely identical and not almost-identical.
The RFC’s behaviour exactly matches that of zend_parse_parameters, with the exception of NULL handling. In fact, the implementation is shared.
Whether zend_parse_parameters’s behaviour matches that of other implicit casts in PHP is another matter.
Since we're talking about v7.0, we do have the option of making changes to
PHP's fundamental casting rules where appropriate (e.g. converting an array
to a string).
But that said, I think the way strings->numbers are handled - where they
accept only numeric strings - is a problem, as it would mean you can't use
casting to an int/float as an ultra-simple way to sanitize untrusted input.
I would change the † section from:
†Non-numeric strings not accepted. Numeric strings with trailing characters
are accepted, but produce a notice.
to
† Numeric strings with trailing characters and non-numeric strings are
accepted, but produce a notice.
- and apply it to both this RFC and the infrastructure convert_to_*(), so
that it applies across the board in PHP.
That could be changed, of course, but I don’t think it’s the job of this RFC to change our implicit casting/validation rules for functions. I would be willing to postpone this RFC’s vote until an RFC that makes some changes gets through, though.
That said, I don’t actually like the idea of changing this specific thing. “0”, “0.0” and possibly even “0 foobar” are reasonable candidates for an integer value, but I don’t think making “” or “foobar” be accepted as numbers makes much sense at all.
Thanks!
Andrea Faulds
http://ajf.me/
On Thu, 1 Jan 2015 14:41:06 +0200,
Zeev Suraski zeev@zend.com wrote:
Hello,
† Numeric strings with trailing characters and non-numeric strings are
accepted, but produce a notice.
Why Scalar Type Hints? What is the goal?
- Hints for IDEs - autocompletion?
- Find bugs with static program analysis?
- Find bugs at runtime?
- Help improve the quality of PHP applications
  - with notices?
  - with errors?
  - with ...?
- ...?
Validation of HTML form input is strict. If I know that
$_POST['age'] contains a positive integer, why should
foo(int $age) then be able to accept float -123.42?
Bye
[|8:)
While in favor of introducing scalar type annotations, I'm against this
proposal in particular. I've held a different position in the past, but by
now I'm thoroughly convinced that if we introduce scalar type declarations,
they should be strict. No wiggling, no casting.
Apart from being consistent with the existing behavior of type declarations
and being what the majority of the vocal community wants, it is also
possible to reason about strict type declarations statically.
This means that an IDE or other tool will be able to perform meaningful
analysis based on typehinted functions. E.g. if you pass the result of a
string function to an int parameter, your code is definitely wrong and you
can be told so. Loose typehints as proposed here do not offer this
possibility, because a string can be or can not be a valid input to an int
parameter depending on the exact value.
For the same reason loose typehints are also more fragile. Code that worked
in casual testing during development will fail in production when
unexpected, improperly validated user input is encountered. With strict
types on the other hand it is very likely that code working with one input
will also work with all other possible inputs, because the type check is
not value-dependent. (Types are typically much less volatile than values.)
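The value-dependence argument can be seen with two built-in predicates. This is a sketch under an assumption: `is_numeric()` stands in for a loose, value-based acceptance rule and `is_int()` for a strict, type-based one; neither is the RFC's actual check.

```php
<?php
// Value-dependent check: two strings, same type, different outcomes.
var_dump(is_numeric("42"));     // bool(true)
var_dump(is_numeric("foobar")); // bool(false)

// Type-only check: every string gets the same answer, whatever it contains.
var_dump(is_int("42"));         // bool(false)
var_dump(is_int("foobar"));     // bool(false)
var_dump(is_int(42));           // bool(true)
```

A static analyser that only knows "this expression is a string" can predict the second group of results but not the first.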
The ability to statically check type annotations is rather important to me.
I think much of the usefulness of this feature would be lost without the
ability to check correct usage with tooling. I'd also like to point out
that Hack uses a strict type scheme and it seems to work well there (though
I do acknowledge that the situation is not the same, as Hack has a
generally more powerful and fully statically checked type system).
Apart from these general thoughts, I also think that this proposal is a
regression from the previous one. In the name of "consistency" it uses the
rather weak zpp validation rules, which allow a lot of questionable input.
Just look at the conversion table in the RFC, practically all of it is
"Yes". I understand the motivation to reduce the number of different
conversion semantics, but I just can't get behind it if it means reusing
existing, bad conversion rules.
Nikita
Hey Nikita,
I would respond with something novel, but I notice now that your email is exactly the same as your comment on reddit, and there’s little point in responding to the same thing twice. So, for that reason, I’ll just reproduce my response on reddit, with some minor edits:
I'm likely -1 on this proposal. I've held a different position in the past, but by now I'm thoroughly convinced that if we introduce scalar typehints, they should be strict.
Why are you opposed to adding non-strict type hints? I realise that strict type hints are desirable for certain reasons, but they fit very poorly with the rest of the language and are unlikely to ever make it into PHP. So why, then, oppose the addition of non-strict hints? Surely these hints, which offer many (albeit not all) of the benefits of strict hints, are far better than nothing?
Apart from being consistent with the existing behavior of typehints
This RFC is consistent with the existing behaviour of extension functions (which match all userland type hints in behaviour). Besides, strict hints would be inconsistent with the rest of the language. PHP has never been strict for scalar types, and I think they'd make a rather awkward fit for that reason.
, it is also possible to reason about strict typehints statically.
This means that an IDE or other tool will be able to perform meaningful analysis based on typehinted functions. E.g. if you pass the result of a string function to an int parameter, your code is definitely wrong and you can be told so. Loose (casting) typehints do not offer this possibility, because a string can be or can not be a valid input to an int parameter depending on the exact value.
You have a point there. Unfortunately, casting hints prevent certain types of validation. However, some type combos always work and some always don't, so you can still error for certain cases (NULL, array, resource, or object where scalar expected).
While they do prevent certain types of validation, they don't prevent optimisations, so HHVM (and perhaps Zend in future) can still benefit from the type information.
For the same reason loose typehints are also more fragile. Code that worked in casual testing during development will fail in production when unexpected, improperly validated user input is encountered.
This is also true to an extent, but it is alleviated partly if code is fully type hinted. If your entire codebase has type hints, such errors can be caught early before they're passed to other functions. Once values are converted to integers, they stay integers (unless you pass it to something taking a string).
Thanks.
Andrea Faulds
http://ajf.me/
The battle between strict and coercive type declarations has been going on for a
while. My problem with coercion to the detriment of strictness is
that sometimes you DON'T WANT TYPE CASTING AT ALL. This new feature would
create serious impediments. So I wonder if we couldn't have both (strict
and coercive type declarations) and leave the currently proposed type
hinting syntax reserved for strict type declarations? Like in:
    function(int $a, (int) $b) {
        // $a will be strict
        // $b will be type cast
    }
This way we can actually choose when to cast and when to be strict, and both
features could be voted on independently without affecting each other's possible
future adoption.
genius and simple syntax!
int $a === assertInt($a)
(int)$a === (int)$a
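The two equivalences above can be emulated in userland with existing PHP, since the `(int) $b` parameter syntax is only a proposal. This is a sketch: `assertInt()` and `demo()` are hypothetical names, and throwing `InvalidArgumentException` is an arbitrary choice of failure mode.

```php
<?php
// Hypothetical userland emulation of `function (int $a, (int) $b)`.
function assertInt($value)
{
    if (!is_int($value)) {
        throw new InvalidArgumentException(
            'Expected int, got ' . gettype($value)
        );
    }
    return $value;
}

function demo($a, $b)
{
    $a = assertInt($a); // "int $a": strict, no conversion
    $b = (int) $b;      // "(int) $b": always cast, never fails
    return array($a, $b);
}

var_dump(demo(1, "2 apples") === array(1, 2)); // bool(true)

try {
    demo("1", 2); // strict parameter rejects even a numeric string
} catch (InvalidArgumentException $e) {
    echo $e->getMessage(), "\n";
}
```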
--
With regards, Alexander Moskalev
irker@irker.net
irker@php.net
a.moskalev@corp.badoo.com
Wow. The compromise proposed by Marcio looks awesome to me :) .
Andrea, I can understand that within PHP's logic, but IMO if the RFC does not
let people stop writing code like I presented before, I don't think it's
completely relevant.
I mean, if people still use old tricks because type hinting doesn't
allow them to do what they wanted, then things are not good enough and should
be rethought. And I say that because I can easily see myself still using the
old trick without type hinting... because of this cast story.
Also note that Marcio's idea could go the other way, like adding something
such as @integer (random operator choice, don't read anything into it) that makes
the check strict... Then IMO this operator would be used so much that it
would not make sense for it not to be the default behavior.
That's why Marcio's proposal is cool to me.
For the record, this was proposed (as an idea) previously with the exact
same syntax (the earliest mention I could find in my mailbox was from Derick
in 2009: http://comments.gmane.org/gmane.comp.php.devel/57653, but I
remember seeing it every time this discussion happens).
If somebody is interested in previous attempts/discussion on the topic,
here are some links for starters:
http://nikic.github.io/2012/03/06/Scalar-type-hinting-is-harder-than-you-think.html
http://blog.ircmaxell.com/2012/03/parameter-type-casting-in-php.html
https://wiki.php.net/rfc/parameter_type_casting_hints
https://wiki.php.net/rfc/typechecking
--
Ferenc Kovács
@Tyr43l - http://tyrael.hu
Ferenc Kovacs, thanks for bringing it up! I wasn't aware this proposal
was so recurrent. Anyway, this seems to be the most interesting solution we
could have now without compromising the long-awaited strict type
checks in the future. One big plus is that "parameter type casting" wouldn't
introduce any BC break and is therefore much easier to pass.
Backed by all the references Ferenc Kovacs listed, I'd like to
suggest that Andrea Faulds use the parameter type casting syntax for this RFC.
Regards Márcio Almada
https://github.com/marcioAlmada
I think it would be no problem to add strict parameter type hints with another RFC (if this RFC gets accepted), e.g. function foobar(string! $str, int! $num){} or any other syntax.
Regards
Thomas
Nikita Popov wrote on 01.01.2015 15:05:
While in favor of introducing scalar type annotations, I'm against this
proposal in particular. I've held a different position in the past, but by
now I'm thoroughly convinced that if we introduce scalar type declarations,
they should be strict. No wiggling, no casting.
Apart from being consistent with the existing behavior of type declarations
and being what the majority of the vocal community wants, it is also
possible to reason about strict type declarations statically.
This means that an IDE or other tool will be able to perform meaningful
analysis based on typehinted functions. E.g. if you pass the result of a
string function to an int parameter, your code is definitely wrong and you
can be told so. Loose typehints as proposed here do not offer this
possibility, because a string can be or can not be a valid input to an int
parameter depending on the exact value.
For the same reason loose typehints are also more fragile. Code that worked
in casual testing during development will fail in production when
unexpected, improperly validated user input is encountered. With strict
types on the other hand it is very likely that code working with one input
will also work with all other possible inputs, because the type check is
not value-dependent. (Types are typically much less volatile than values.)
The ability to statically check type annotations is rather important to me.
I think much of the usefulness of this feature would be lost without the
ability to check correct usage with tooling. I'd also like to point out
that Hack uses a strict type scheme and it seems to work well there (though
I do acknowledge that the situation is not the same, as Hack has a
generally more powerful and fully statically checked type system).
Apart from these general thoughts, I also think that this proposal is a
regression from the previous one. In the name of "consistency" it uses the
rather weak zpp validation rules, which allow a lot of questionable input.
Just look at the conversion table in the RFC, practically all of it is
"Yes". I understand the motivation to reduce the number of different
conversion semantics, but I just can't get behind it if it means reusing
existing, bad conversion rules.
Nikita
I think it is no problem to add strict parameter type hints with another
rfc (if this rfc gets accepted), e.g. function foobar(string! $str, int!
$str){} or any other syntax.
The problem is that the current proposed hints/casts are deviating from the
type-hints that we are used to, therefore this particular feature should
(eventually) have an alternate syntax, whereas the strict hints would just
use int, integer, string, float, bool, boolean without additional chars
(assuming that we will get strict hints at some point).
This will also reduce clashes with the current HHVM implementation.
Marco Pivetta
Hey Marco,
The problem is that the current proposed hints/casts are deviating from the type-hints that we are used to, therefore this particular feature should (eventually) have an alternate syntax, whereas the strict hints would just use int, integer, string, float, bool, boolean without additional chars (assuming that we will get strict hints at some point).
I wouldn’t say it’s really such a great deviation. We use the same syntax for scalars and non-scalars in the manual.
This will also reduce clashes with the current HHVM implementation.
HHVM doesn’t have scalar type hints for PHP. Hack does, but Hack isn’t PHP.
Thanks.
Andrea Faulds
http://ajf.me/
The problem is that the current proposed hints/casts are deviating from the
type-hints that we are used to
I don't think type hints we currently have are different. The only difference we have is that there is no automatic casting for objects/arrays to scalars and scalars to objects/arrays (I exclude __toString() and $a=[]; echo $a;).
Regards
Thomas
Marco Pivetta wrote on 01.01.2015 17:56:
I think it is no problem to add strict parameter type hints with another
rfc (if this rfc gets accepted), e.g. function foobar(string! $str, int!
$str){} or any other syntax.
The problem is that the current proposed hints/casts are deviating from the
type-hints that we are used to, therefore this particular feature should
(eventually) have an alternate syntax, whereas the strict hints would just
use int, integer, string, float, bool, boolean without additional chars
(assuming that we will get strict hints at some point).
This will also reduce clashes with the current HHVM implementation.
Marco Pivetta
Hi!
The problem is that the current proposed hints/casts are deviating from the
type-hints that we are used to, therefore this particular feature should
Let's check the manual we're used to.
http://php.net/manual/en/function.substr.php
string substr ( string $string , int $start [, int $length ] )
What "string" and "int" mean there? How they work? What we're
"deviating" from?
Stas Malyshev
smalyshev@gmail.com
Hi!
The problem is that the current proposed hints/casts are deviating from
the type-hints that we are used to, therefore this particular feature should
Let's check the manual we're used to.
http://php.net/manual/en/function.substr.php
string substr ( string $string , int $start [, int $length ] )
What "string" and "int" mean there? How they work? What we're
"deviating" from?
I'm not sure why everyone is still taking the PHP manual as a good
reference about how to write software: PHP internal functions are one of
the main reasons why this language is under-appreciated.
The manual is pulling the concepts of int, string and so on out of thin
air, whereas the correct syntax in those cases is int|string|Stringable,
with explicit explanation of what those strings should look like.
This is what you currently do in a real-world scenario (due to the lack of
hints for internal types):
class Shipment
{
    public function __construct(ProductId $productId, $amount)
    {
        if (! is_int($amount)) {
            throw new InvalidArgumentException(sprintf(
                'Provided $amount must be integer, %s given',
                gettype($amount)
            ));
        }
        $this->productId = $productId;
        $this->amount = $amount;
    }
}
No allowance of weird values passed in: why would my software ever pass an
invalid $amount to my constructor? That's where I'd put a hard-failing
assertion instead.
This is what I'd like it to be:
class Shipment
{
    public function __construct(ProductId $productId, int $amount)
    {
        $this->productId = $productId;
        $this->amount = $amount;
    }
}
Following code MUST cause a hard failure:
new Shipment($bananasId, '1 of a whole lot');
This is constraining. Constraining has nothing to do with validation and
casting: mixing the concepts of type-juggling, validation and constraining
is a huge mess (which I don't like, but it's better than having nothing),
and it would better be off using a syntax like:
class Shipment
{
    public function __construct(ProductId $productId, (int) $amount)
    {
        $this->productId = $productId;
        $this->amount = $amount;
    }
}
This makes the difference much more clear, as that (int) is not a
constraint, it's a different, broader concept.
I'd rather have the new behavior suggested by Andrea with a syntax that
makes this subtle yet gigantic difference explicit.
Additionally, the BC break concern of strict type-hinting and classes named
String, Int and Bool (and similar) is delayed until we get strict
type-hints, as the syntax is currently not allowed by the language and
doesn't present any BC issues (http://3v4l.org/3Fqdh):
function sum((float) $a, (float) $b) { }
From an implementation perspective, it should just be a parser change.
@Andrea: as for the "strict" and "non-strict" PHP suggestion you had
before, please don't do that. Take the following example:
function repeat(int $amount, (string) $value) {
    $acc = '';
    $i = 0;
    while ($i < $amount) {
        $i += 1;
        $acc .= $value;
    }
    return $acc;
}
As you can see, mixing implicit cast and strict constraining behaviors is
perfectly fine in this case, so please don't include contextual switches:
that would be even worse IMO.
Marco Pivetta
Hi Marco,
I'm not sure why everyone is still taking the PHP manual as a good reference about how to write software: PHP internal functions are one of the main reasons why this language is under-appreciated.
The manual is pulling the concepts of int, string and so on out of thin air, whereas the correct syntax in those cases is int|string|Stringable, with explicit explanation of what those strings should look like.
I don’t see why the manual is wrong. Yes, in a strictly-typed language which allows no conversion of arguments, foobar(int $foo) wouldn’t have the behaviour PHP exhibits. Yet PHP is not a strictly-typed language, and weakly-typed parameters are hardly a novel concept. The language that PHP is implemented in, C, also has this. And yet, C does not have this:
void foobar(char|unsigned char|short|unsigned short|int|unsigned int|long|unsigned long|long long|unsigned long long|float|double|_Bool|void* foo)
Why? Because in C, implicit conversions between parameter types are permitted. PHP has the same thing for its internal/extension functions. The manual isn’t wrong.
This is constraining. Constraining has nothing to do with validation and casting: mixing the concepts of type-juggling, validation and constraining is a huge mess (which I don't like, but it's better than having nothing), and it would better be off using a syntax like:
Argument types do not necessarily exist purely to error on invalid input. They also exist for documentation purposes and, in languages like C, implicit conversion.
public function __construct(ProductId $productId, (int) $amount)
This makes the difference much more clear, as that (int) is not a constraint, it's a different, broader concept.
I don’t think the cast-like syntax is a particularly good idea. It’s inconsistent with our manual conventions (then again, many other things are). It’s misleading, as well: we don’t do an explicit cast. If it was an explicit cast, literally any value would be accepted. But that’s not the case at all, the weakly-typed parameters that extension functions have do not accept any value. Instead, they accept the desired type, and a limited range of convertible values of other scalar types.
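To make the distinction concrete, here is a minimal sketch comparing an explicit cast with what an internal function does with the same value. The helper name internal_int_param_accepts() is made up for illustration. It probes the second parameter of substr() and assumes that a rejected argument shows up either as a TypeError (newer PHP versions) or as a warning plus a null return (older versions); that assumption is noted in the comments as well.

```php
<?php
// Hypothetical helper: does an internal function's int parameter accept $value?
// Assumption: rejection is either a TypeError or a warning with a null return.
function internal_int_param_accepts($value)
{
    try {
        // Older PHP versions warn and return null when parameter parsing fails.
        $result = @substr('hello', $value);
    } catch (TypeError $e) {
        // Newer PHP versions throw instead of warning.
        return false;
    }
    return $result !== null;
}

var_dump((int) 'abc');                        // an explicit cast takes any string
var_dump(internal_int_param_accepts('1'));    // numeric string is converted
var_dump(internal_int_param_accepts('abc'));  // non-numeric string is rejected
```

The cast quietly yields 0 for 'abc', whereas the internal function refuses the same string, which is the "limited range of convertible values" referred to above.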
Additionally, the BC break concern of strict type-hinting and classes named String, Int and Bool (and similar) is delayed until we get strict type-hints, as the syntax is currently not allowed by the language and doesn't present any BC issues (http://3v4l.org/3Fqdh):
I’d rather not delay it. We probably should have reserved syntax for scalar hints ages ago.
@Andrea: as for the "strict" and "non-strict" PHP suggestion you had before, please don't do that. Take the following example:
function repeat(int $amount, (string) $value) {
    $acc = '';
    $i = 0;
    while ($i < $amount) {
        $i += 1;
        $acc .= $value;
    }
    return $acc;
}
As you can see, mixing implicit cast and strict constraining behaviors is perfectly fine in this case, so please don't include contextual switches: that would be even worse IMO.
I don’t understand why that particular example makes sense. Since it’s producing a string value, surely $value should always be a string? I really don’t like the idea of mixing strong- and weakly-typed parameters. We should be consistent. Otherwise, we are imposing too high a mental burden on programmers, who will now need to remember which parameters are strongly-typed and which parameters are weakly-typed.
Thanks.
--
Andrea Faulds
http://ajf.me/
Hi Marco,
I'm not sure why everyone is still taking the PHP manual as a good
reference about how to write software: PHP internal functions are one of
the main reasons why this language is under-appreciated.
The manual is pulling the concepts of int, string and so on out of
thin air, whereas the correct syntax in those cases is
int|string|Stringable, with explicit explanation of what those strings
should look like.
I don’t see why the manual is wrong. Yes, in a strictly-typed language
which allows no conversion of arguments, foobar(int $foo) wouldn’t have the
behaviour PHP exhibits. Yet PHP is not a strictly-typed language, and
weakly-typed parameters are hardly a novel concept. The language that PHP
is implemented in, C, also has this. And yet, C does not have this:
void foobar(char|unsigned char|short|unsigned short|int|unsigned int|long|
    unsigned long|long long|unsigned long long|float|double|_Bool|void* foo)
Why? Because in C, implicit conversions between parameter types are
permitted. PHP has the same thing for its internal/extension functions. The
manual isn’t wrong.
The manual is wrong since it specifies a strict hint for something that is
mixed. It is still useful though, since it's telling us "it accepts"
integer-ish values there. It's purely for documentation purposes though; it
is by no means dictating the actual implementation.
This is constraining. Constraining has nothing to do with validation and
casting: mixing the concepts of type-juggling, validation and constraining
is a huge mess (which I don't like, but it's better than having nothing),
and it would better be off using a syntax like:
Argument types do not necessarily exist purely to error on invalid input.
They also exist for documentation purposes and, in languages like C,
implicit conversion.
No, argument types exist to prevent mistakes: they prevent invalid values
from crossing validation boundaries of the application. Documentation
purposes are purely secondary, we already have phpdoc for that.
public function __construct(ProductId $productId, (int) $amount)
This makes the difference much more clear, as that (int) is not a
constraint, it's a different, broader concept.
I don’t think the cast-like syntax is a particularly good idea. It’s
inconsistent with our manual conventions (then again, many other things
are).
Again with the manual (sigh): the manual comes AFTER the code has been
written.
It’s misleading, as well: we don’t do an explicit cast. If it was an
explicit cast, literally any value would be accepted.
Agree on that, then give it a different name and/or syntax, but it's not a
constraint then.
But that’s not the case at all, the weakly-typed parameters that extension
functions have do not accept any value. Instead, they accept the desired
type, and a limited range of convertible values of other scalar types.
~int ~float and ~string are fine as well here IMO, if you think that (int)
(float) and (string) are misleading.
Additionally, the BC break concern of strict type-hinting and classes
named String, Int and Bool (and similar) is delayed until we get
strict type-hints, as the syntax is currently not allowed by the language
and doesn't present any BC issues (http://3v4l.org/3Fqdh):
I’d rather not delay it. We probably should have reserved syntax for
scalar hints ages ago.
It was just a plus for getting it done to move over to actual type
specifications :-) Introducing a BC break always increases the likeliness
of a change being accepted by a huge lot.
@Andrea: as for the "strict" and "non-strict" PHP suggestion you had
before, please don't do that. Take the following example:
function repeat(int $amount, (string) $value) {
    $acc = '';
    $i = 0;
    while ($i < $amount) {
        $i += 1;
        $acc .= $value;
    }
    return $acc;
}
As you can see, mixing implicit cast and strict constraining behaviors
is perfectly fine in this case, so please don't include contextual
switches: that would be even worse IMO.
I don’t understand why that particular example makes sense. Since it’s
producing a string value, surely $value should always be a string?
The difference is that $amount must always be an integer (not integer-ish)
value, whereas $value must be a stringable value, and the cast would happen
at call-time, not at every loop (very relevant for instances of classes
implementing __toString(), as the call happens only once).
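A minimal sketch of the call-time point, using only syntax that works today: CountingLabel, repeat_uncast() and repeat_cast_once() are illustrative names, and the up-front (string) cast merely stands in for what a converting parameter would do at the call boundary.

```php
<?php
// Hypothetical stringable object that counts how often it is converted.
final class CountingLabel
{
    public $calls = 0;

    public function __toString()
    {
        $this->calls++;
        return 'ab';
    }
}

// No conversion at the boundary: the object is converted on every iteration.
function repeat_uncast($amount, $value)
{
    $acc = '';
    for ($i = 0; $i < $amount; $i++) {
        $acc .= $value; // __toString() runs here each time
    }
    return $acc;
}

// Conversion once on entry, as a casting parameter would do.
function repeat_cast_once($amount, $value)
{
    $value = (string) $value; // single __toString() call
    $acc = '';
    for ($i = 0; $i < $amount; $i++) {
        $acc .= $value;
    }
    return $acc;
}

$a = new CountingLabel();
$b = new CountingLabel();
repeat_uncast(3, $a);
repeat_cast_once(3, $b);
var_dump($a->calls, $b->calls);
```

Both functions build the same string; only the number of __toString() invocations differs (three versus one for an $amount of 3).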
I really don’t like the idea of mixing strong- and weakly-typed
parameters. We should be consistent. Otherwise, we are imposing too high a
mental burden on programmers, who will now need to remember which
parameters are strongly-typed and which parameters are weakly-typed.
I think the example I just gave you is very consistent, explicit and easy
to understand. Additionally, I don't see any particular mental burden
except for having to know that $value will be cast to a string if it isn't.
If there is any mental burden, it's mainly introduced by the proposed RFC,
whereas strict checking would remove any doubts about what $value can be.
Marco Pivetta
The manual is wrong since it specifies a strict hint for something that is mixed. It is still useful though, since it's telling us "it accepts" integer-ish values there. It's purely for documentation purposes though; it is by no means dictating the actual implementation.
It doesn’t specify a strict hint at all. If this was Java, that would be a “strict” hint, perhaps, but this isn’t Java. Again, the C programming language has weak parameter types.
No, argument types exist to prevent mistakes: they prevent invalid values from crossing validation boundaries of the application. Documentation purposes are purely secondary, we already have phpdoc for that.
Again, this is not true in all languages. Simply because this is the purpose of parameter types in one language does not mean that it is the purpose of them in all languages.
Again with the manual (sigh): the manual comes AFTER the code has been written.
So? I don’t see how that changes anything. The PHP manual’s conventions are well established and familiar to all PHP programmers.
Yes, it’s a manual, it’s documentation. But it has a well-established syntax for parameter types. Why should we deviate from it? Why wouldn’t new users be confused that the syntax used in the manual does something completely different for userland PHP code?
The difference is that $amount must always be an integer (not integer-ish) value, whereas $value must be a stringable value, and the cast would happen at call-time, not at every loop (very relevant for instances of classes implementing __toString(), as the call happens only once).
Why must $amount, before entering the body of the function, be an integer? An integer that was converted from a float or a string would work equally well. Surely, in this case, you should make both arguments strict, for consistency?
I think the example I just gave you is very consistent, explicit and easy to understand.
I don’t. I can’t understand why it’s fine to cast the second parameter, yet not the first. Why wouldn’t the following work? Let’s assume it follows the RFC, so int and string are weak type hints.
function repeat(int $amount, string $value) {
    $acc = '';
    $i = 0;
    while ($i < $amount) {
        $i += 1;
        $acc .= $value;
    }
    return $acc;
}
What’s wrong with this function now? I don’t understand why $value should be weakly-typed and $amount shouldn’t be. Why should there be inconsistency here?
If there is any mental burden, it's mainly introduced by the proposed RFC, whereas strict checking would remove any doubts about what $value can be.
But you’re proposing to have this RFC’s behaviour and have the strict behaviour. How is that less of a mental burden?
Thanks.
Andrea Faulds
http://ajf.me/
I think it is no problem to add strict parameter type hints with another rfc (if this rfc gets accepted), e.g. function foobar(string! $str, int! $str){} or any other syntax.
I would rather have it the other way around. string $str is strict
and some other syntax (notably (string) $str or @string $str) can
be loose.
Hey Levi,
I think it is no problem to add strict parameter type hints with another rfc (if this rfc gets accepted), e.g. function foobar(string! $str, int! $str){} or any other syntax.
I would rather have it the other way around. string $str is strict and some other syntax (notably (string) $str or @string $str) can be loose.
Rather than having it on a function-by-function basis, meaning different APIs would have different rules, I am warming to the idea of having a per-file strict mode. That is, if you used something like “use strict;” at the top of your code, functions you called would have strict parameter type checking. But if you didn’t, they’d be loose. This way all APIs would be consistent, but you’d have the choice of strictness or non-strictness as you see fit.
This would be similar to Hack’s Strict and Decl modes.
I’m not necessarily saying it’s the best idea, but it’s certainly an option. The syntax is already reserved! ;)
Thanks.
Andrea Faulds
http://ajf.me/
when looking into phpdoc, e.g. http://de.php.net/substr we have string and int everywhere in function definitions, do you want to change the whole documentation?
Regards
Thomas
Levi Morrison wrote on 01.01.2015 18:19:
I think it is no problem to add strict parameter type hints with another rfc
(if this rfc gets accepted), e.g. function foobar(string! $str, int! $str){}
or any other syntax.
I would rather have it the other way around. string $str is strict
and some other syntax (notably (string) $str or @string $str) can
be loose.
While in favor of introducing scalar type annotations, I'm against this
proposal in particular. I've held a different position in the past, but by
now I'm thoroughly convinced that if we introduce scalar type declarations,
they should be strict. No wiggling, no casting.
I agree. This is the same sentiment I have expressed in previous
discussions, and I'm sticking with it.
+1 for "scalar type annotations"
-2 for the implicit type conversion.
Hi!
Apart from being consistent with the existing behavior of type declarations
We have no existing behavior of scalar type declarations except for
hidden ones in internal functions, and object types behave in PHP
completely differently from scalars, so there's no place for a
"consistency" claim here. Scalars have always been convertable in PHP,
and internal functions accepting scalars have always behaved in a coercive
and not strict way.
and being what the majority of the vocal community wants, it is also
I have no idea what "vocal community" is, but what exactly you're basing
your claim of majority on?
possible to reason about strict type declarations statically.
You can reason about coercive declaration as well as you can reason
about strict one.
possibility, because a string can be or can not be a valid input to an int
parameter depending on the exact value.
And the IDE can tell you exactly that.
For the same reason loose typehints are also more fragile. Code that worked
in casual testing during development will fail in production when
unexpected, improperly validated user input is encountered. With strict
types on the other hand it is very likely that code working with one input
will also work with all other possible inputs, because the type check is
Since all the inputs are ultimately strings for PHP, so you'd have to
convert from string to other types somewhere, and if that code does not
perform properly on certain input, you still get a failure. You are just
moving the failure around. And of course the claim "if the code works
correctly with one integer input it would work correctly with any other
integer input" is wildly incorrect. In fact, type errors (if you define
types broadly as string/integer) are a tiny minority of logic errors in
software.
ability to check correct usage with tooling. I'd also like to point out
that Hack uses a strict type scheme and it seems to work well there (though
Java uses strict typing too. And Perl has no typing whatsoever. So what?
How bringing an example of one language that does what you want is an
argument for PHP to do that? There are examples of languages doing
practically anything imaginable.
Just look at the conversion table in the RFC, practically all of it is
"Yes". I understand the motivation to reduce the number of different
How many "yes" is a bad thing? It's not a competition of which approach
can produce the most error messages. It's about which one is the most
productive to work with. You want PHP to be strictly typed, but PHP is
not that language. If you want to make PHP that language, IMHO it's
misguided but doing it by introducing inconsistencies via new syntax
that behaves different from existing analogous parts of the language -
like having typed parameter behave differently in internal and
user-defined function - is both misguided and wrong.
Stas Malyshev
smalyshev@gmail.com
ability to check correct usage with tooling. I'd also like to point out
that Hack uses a strict type scheme and it seems to work well there (though
Java uses strict typing too. And Perl has no typing whatsoever. So what?
How bringing an example of one language that does what you want is an
argument for PHP to do that? There are examples of languages doing
practically anything imaginable.
To add to what Stas says: C, in fact, has weak parameter typing. A function taking a long long can be passed a char, or a double, or even a pointer. Now, there’s a good case to be made that C’s weak parameter typing is bad. But that’s mostly because C will let you shoot yourself in the foot and overflow bounds, corrupt memory or cause a segfault. PHP doesn’t have these issues though, so that argument doesn’t really apply.
By the way, “weak parameter typing” sounds like a good name for this. Makes more sense than “non-strict” or “casting”.
--
Andrea Faulds
http://ajf.me/
On Fri, Jan 2, 2015 at 12:58 AM, Stanislav Malyshev smalyshev@gmail.com
wrote:
Hi!
Apart from being consistent with the existing behavior of type
declarations
We have no existing behavior of scalar type declarations except for
hidden ones in internal functions and object types behave in PHP
completely differently from scalars, so there's no place for
"consistency" claim here. Scalars always has been convertable in PHP,
internal functions accepting scalars always behaved in a coercive and
not strict way.
I'm referring to consistency with existing non-scalar type declarations
supported by userland functions. They are strict.
Furthermore it should be pointed out that scalars aren't much more
"convertable" than other types in PHP. For example it is possible to cast
pretty much any value into an array (including scalars), yet nobody is
arguing that "array" typed parameters should accept integers as input. PHP
in general is very liberal with type conversions, this is not specific to
scalars.
and being what the majority of the vocal community wants, it is also
I have no idea what "vocal community" is, but what exactly you're basing
your claim of majority on?
By "vocal community" I'm referring to people who state an opinion on this
topic in a non-internals discussion. The scalar typehinting proposal has
come up many times over the years in many minor variations and every time
discussions on reddit etc are dominated by the preference of having strict
type declarations. In various OTR conversations this was also by far the
most common opinion I heard.
The "vocal" was explicitly mentioned because I realize that this sample is
subject to selection bias.
possible to reason about strict type declarations statically.
You can reason about coercive declaration as well as you can reason
about strict one.
I am referring to weak typehints as described by this particular proposal.
Yes, you can of course perform static analysis even in the presence of
implicit conversions. However the problem with this proposal, and all other
similar proposals I have seen, is that the type check is value
dependent. E.g. saying "integers and floats are accepted for float
parameters" is okay from a static analysis perspective, whereas "strings
are only accepted for int parameters if they contain decimal integers" is
not.
Additionally this particular proposal at least is loose to the point of
making static analysis pointless even in the cases where it can be done.
E.g. if even something obviously wrong like passing "foobar" to a boolean
argument cannot be detected as incorrect code (because it would be accepted
by PHP), then I don't think doing type analysis would have any value.
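The distinction can be sketched with two hand-written predicates. Both function names are hypothetical, and the string rule is a deliberately simplified stand-in for the RFC's conversion rules (which are not reproduced here); the point is only that the first check can be answered from the argument's type, while the second needs its value.

```php
<?php
// Hypothetical rule 1: "integers and floats are accepted for float parameters".
// The answer depends only on the type of the argument.
function float_param_accepts($value)
{
    return is_int($value) || is_float($value);
}

// Hypothetical rule 2: "strings are only accepted for int parameters if they
// contain decimal integers". The answer depends on the value of the string.
// Simplified: an optional minus sign followed by digits, nothing else.
function int_param_accepts($value)
{
    if (is_int($value)) {
        return true;
    }
    return is_string($value) && preg_match('/\A-?[0-9]+\z/', $value) === 1;
}

// Same type (string), different outcome: a tool that only knows the
// argument is "a string" cannot say which of these calls is valid.
var_dump(int_param_accepts('42'));
var_dump(int_param_accepts('abc'));

// Type alone is enough here.
var_dump(float_param_accepts(1), float_param_accepts('1.5'));
```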
For the same reason loose typehints are also more fragile. Code that worked
in casual testing during development will fail in production when
unexpected, improperly validated user input is encountered. With strict
types on the other hand it is very likely that code working with one input
will also work with all other possible inputs, because the type check is
Since all the inputs are ultimately strings for PHP, so you'd have to
convert from string to other types somewhere, and if that code does not
perform properly on certain input, you still get a failure. You are just
moving the failure around. And of course the claim "if the code works
correctly with one integer input it would work correctly with any other
integer input" is wildly incorrect. In fact, type errors (if you define
types broadly as string/integer) are a tiny minority of logic errors in
software.
I was probably not sufficiently clear here. Strict type declarations force
you to explicitly validate and cast any input you receive from the user in
stringified form. Because of this you will always provide a valid type to
the function, regardless of what the user entered. With weak typehints on
the other hand you will get away with working directly on unvalidated input
(like passing $_GET['id'] directly to an int-hinted parameter). Usually.
However when the script is called with ?id=abc you will get a runtime
error. That is what I'm referring to when I say "more fragile".
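As a rough sketch of that workflow, assuming the filter extension is available: find_user() and parse_id() are hypothetical names, and the is_int() guard stands in for a strict int parameter, which does not exist in the language as discussed here.

```php
<?php
// Hypothetical function standing in for one with a strict int parameter.
function find_user($id)
{
    if (!is_int($id)) {
        throw new InvalidArgumentException('$id must be an integer, ' . gettype($id) . ' given');
    }
    return 'user#' . $id;
}

// Hypothetical boundary helper: validate and convert raw request input once.
function parse_id($raw)
{
    $id = filter_var($raw, FILTER_VALIDATE_INT);
    if ($id === false) {
        throw new UnexpectedValueException('id must be a decimal integer');
    }
    return $id;
}

$query = array('id' => '42'); // stands in for $_GET
echo find_user(parse_id($query['id'])), "\n";
```

With this split, ?id=abc fails inside parse_id() with an error about the input, and find_user() only ever sees integers regardless of what was submitted.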
ability to check correct usage with tooling. I'd also like to point out
that Hack uses a strict type scheme and it seems to work well there (though
Java uses strict typing too. And Perl has no typing whatsoever. So what?
How bringing an example of one language that does what you want is an
argument for PHP to do that? There are examples of languages doing
practically anything imaginable.
Hack is very similar to PHP. If something works for Hack it is likely to
also work well for PHP.
Be realistic. I'm not trying to sell you algebraic data types because
they're nice in Haskell, and I'm not trying to introduce Lisp macros
either. I'm talking about a language that is derived from PHP. I think
Hack is a nice playground for new PHP features and we should be
incorporating the parts that turned out to be useful.
Just look at the conversion table in the RFC, practically all of it is
"Yes". I understand the motivation to reduce the number of different
How many "yes" is a bad thing?
Yes. There's a tradeoff between convenience of usage and how many bugs the
type checking catches. For maximum convenience we can just allow all types
for all hints (there was one proposal to do this - effectively this would
make the typehints documentation only). For maximum correctness we allow
only the specified type. I prefer correctness over convenience, but I
understand that others have different opinions on this.
Nikita
Hi!
I'm referring to consistency with existing non-scalar type declarations
supported by userland functions. They are strict.
There's no choice here as object types are not convertable in principle
(well, not unless we do convert ctors like C++ but IMHO that would be a
bit crazy). You can't equate primitive types in PHP with object types -
they just work differently and there can not be "consistency" between
them unless you rewrite PHP to be strongly typed language with no
implicit type conversion. It's not unheard of - Python is like that -
but it's not what PHP is now.
Furthermore it should be pointed out that scalars aren't much more
"convertable" than other types in PHP. For example it is possible to
That's not true - you can convert int to string, string to int, int to
bool, float to bool, etc. Most of these work implicitly - e.g. you can
do if($integer) and integer becomes bool. You can do "foo".$integer. You
can not have $integer->fileName() and have integer become some object
that has filename.
cast pretty much any value into an array (including scalars), yet nobody
is arguing that "array" typed parameters should accept integers as
That's because array is not a scalar type.
See
https://github.com/php/php-langspec/blob/master/spec/05-types.md#scalar-types
Try is_scalar([]).
Array is a collection type. Its behavior is different from scalars and
is more like objects (in fact, if it weren't for BC, it should have been
behaving exactly like objects and be just an object type, but that ship
has sailed).
Also, there's no implicit conversion from scalars to arrays. There's an
explicit one but that's different because explicit conversion is always
more powerful since it can assume you know what you're doing if you
asked to convert, so it should try harder.
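To make the distinction above concrete, here is a small sketch (the variable names are only illustrative) of the implicit scalar conversions mentioned, next to the explicit array cast:

```php
<?php
$integer = 5;

// Implicit scalar conversions: no cast is written, PHP converts on demand.
$asBool   = $integer ? 'truthy' : 'falsy';   // int used in a boolean context
$asString = "foo" . $integer;                // int used in a string context

// Arrays are not scalars.
$scalarCheck = [is_scalar($integer), is_scalar([])];

// An explicit cast wraps the scalar in a one-element array.
$wrapped = (array) $integer;

var_dump($asBool, $asString, $scalarCheck, $wrapped);
```

Nothing in this snippet converts a scalar to an array without the `(array)` cast being spelled out.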
input. PHP in general is very liberal with type conversions, this is not
specific to scalars.
It is specific to scalars - PHP is very liberal with implicit type
conversions of scalar types, but the only conversion you can do between
non-scalar types are degraded ones like object->array and
array->stdClass but even those are only for explicit cast - there are no
implicit casts between non-scalar types at all. That makes sense because
the semantic of non-scalar types leaves no place for implicit cast - you
can not just conjure a SplFileInfo out of object of another class, it's
too complex for that. There's a fundamental difference between scalars and
complex types in this regard.
By "vocal community" I'm referring to people who state an opinion on
this topic in a non-internals discussion. The scalar typehinting
Conveniently, there's no way to verify this claim. Not that random
assembly of reddit commenters is a representative group of anything but
a random assembly of reddit commenters.
of having strict type declarations. In various OTR conversations this
was also by far the most common opinion I heard.
So, you talk to people that think like you and share your opinion. That
must mean you're right or they are a majority of all PHP community?
proposal, and all other similar proposals I have seen, is that the
type check is value dependent. E.g. saying "integers and floats are
accepted for float parameters" is okay from a static analysis
perspective, whereas "strings are only accepted for int parameters if
they contain decimal integers" is not.
That depends on the purpose of your analysis. If your analysis is to
ensure the code never fails, type analysis would achieve very little.
What use would it be to know something is integer if you're taking
square root of it? You'd also need to know if it's not negative. And if
you're talking about bank balance withdrawal you'd also need to know
it's less than the current balance. And so on. You can not prove the
code works just by looking at types, especially shallow types like PHP has.
If your analysis, however, is the tool to ensure you didn't make obvious
mistakes - like returning an HTML page and thinking it's a number - weak
typing won't prevent anybody from discovering that.
Moreover, if your analysis is intended to help the optimizations - i.e.
after this point this variable is a known integer so I can drop all
IS_OBJECT checks and just add it to another integer - it would work too.
The only situation it wouldn't work is when you expect types to do
something types can't do anyway.
E.g. if even something obviously wrong like passing "foobar" to a
boolean argument cannot be detected as incorrect code (because it would
Why is it obviously wrong? There are many use cases where passing a
non-empty string in a boolean context is completely fine. For example:
$response = get_response();
if ($response) {
    ok();
    do_stuff_with($response);
} else {
    error();
}
It's completely fine and idiomatic PHP, why wouldn't you want to accept it?
be accepted by PHP), then I don't think doing type analysis would have
any value.
That's wrong, as I outlined above there are many cases where it could be
useful, so the approach of "do it my way or all is lost" is completely
unwarranted.
weak typehints on the other hand you will get away with working directly
on unvalidated input (like passing $_GET['id'] directly to an int-hinted
parameter). Usually. However when the script is called with ?id=abc you
will get a runtime error. That is what I'm referring to when I say "more
If "abc" is not a valid input for your logic, you'd get a runtime error
in any case.
Hack is very similar to PHP.
That's a weird case of logic inversion. You start with a claim that
"Hack is very similar to PHP", then you proceed to show a part of Hack
that is not very similar to PHP - namely, strict typing which PHP never
had - and then you claim this proves PHP should change. Maybe this just
proves Hack is not as similar to PHP as you previously thought?
If something works for Hack it is likely to
also work well for PHP.
Hack was written for one use case and its usage now is a tiny fraction
of PHP usage. We have no idea how anything Hack does would work on the
scale (both time and user-base wise) of PHP. Maybe it would work, maybe
it wouldn't. Claiming just because people designing Hack thought
something is a good idea automatically makes it a good idea for PHP is
completely baseless.
Be realistic. I'm not trying to sell you algebraic data types because
they're nice in Haskell, and I'm not trying to introduce Lisp macros
either. I'm talking about a language that is derived from PHP. I think
It is derived, that is true. But where it departs from PHP, there we
must be careful. Their derivation does not automatically make PHP their
follower.
Hack is a nice playground for new PHP features and we should be
incorporating the parts that turned out to be useful.
For that, we need first to determine they are useful for PHP, not just
claim "Hack has it, so we must have it too". Especially when we talk
about the features which go contrary to the very principles on which PHP
is built. Maybe these principles do not work for you, maybe you'd prefer
a language with strict typing and more powerful type inference. That's
fine, there are languages for that. But when we're talking about making
PHP a strict-typed language, it's a different story and "other boys do
that" is not going to be a good argument.
correctness we allow only the specified type. I prefer correctness over
convenience, but I understand that others have different opinions on this.
When you say it's your preference, I completely understand you.
Everybody has their preferences and styles, some prefer more strict
languages, some prefer more relaxed one, some choose these and those
depending on context. But if we talk in terms of "majority", "useless
without it", etc. then there's much bigger disagreement than just
preferences.
Stas Malyshev
smalyshev@gmail.com
I don't necessarily have any more insight to provide than has already
been done, but I do want to chime in and say that I personally favor
strict types as Nikita Popov has been advocating.
I don't necessarily have any more insight to provide than has already
been done, but I do want to chime in and say that I personally favor
strict types as Nikita Popov has been advocating.
Same from me. I support all arguments so far made by Nikita; they're
well laid out, better than I could have done it.
I think the only real world usefulness is the strict typing; however:
Building on that I also think that the weak proposal using
function( (int) $a_number)
has its place and could be very good for moving forward. It would give
both camps their options.
But to move this forward in this line I think a new RFC would need to be
created because it's a different goal.
- Markus
I don't necessarily have any more insight to provide than has already
been done, but I do want to chime in and say that I personally favor
strict types as Nikita Popov has been advocating.
Same from me. I support all arguments so far made by Nikita; they're
well laid out, better than I could have done it.
And I, on the other hand, disagree.
Looking at it from an OSS maintainer perspective, introducing strict
hints in code would be a huge BC break as I don't know how people use my
code, nor if they validate/coerce their user input early or not. If I
suddenly declare something as int and someone used to pass '5', they get
an error. That would make adoption quite hard in OSS IMO if you don't
want to bother people.
With weak typing on the other hand, I get to remove tons of @param
annotations which would be amazing, and I get additional safeties that
new mis-uses (like passing an object in an arg expecting a string,
which at the moment could not be type hinted) will be caught early in
the future.
Existing mis-uses probably don't exist much since they'd likely fail
further down in the code execution already, but in case they don't
they'd be caught too and that's a good thing since it makes people look
at their buggy code.
As for valid uses of the code, like the '5' above, it keeps working just
fine because it is fine. It's not my call as a library author to decide
whether the users of the lib should cast user inputs to ints or validate
them or do nothing at all.
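As a rough illustration of the '5' case above: the sketch below assumes PHP 7 or later in its default (non-strict) mode, where mismatches surface as a TypeError. The RFC draft under discussion talks about errors rather than exceptions, so treat the catch block as an assumption about the shipped behaviour, and `times_two` as a made-up function.

```php
<?php
// Hypothetical library function with a weak int declaration.
function times_two(int $n)
{
    return $n * 2;
}

// A caller that always passed a numeric string keeps working.
$fromString = times_two('5');

// A non-numeric string is rejected instead of silently becoming 0.
try {
    times_two('abc');
    $rejected = false;
} catch (TypeError $e) {
    $rejected = true;
}

var_dump($fromString, $rejected);
```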
Besides, IMO if strict typing was so desirable we would see more of the
String/Int classes out there, or we would see people do things like:
function foo($number, $string) {
expectNumber($number); expectString($string);
}
That'd be a simple lib to write, it barely adds more code to your
functions, and you get strict typing. But I hardly see anyone do this,
and I would argue it's because it sounds appealing in theory but it's
not worth it in practice.
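For reference, such a library could be as small as the sketch below. The function names follow the example above; the bodies and the body of foo() are only a guess at what one might write.

```php
<?php
// Hypothetical helper: accept only real ints and floats.
function expectNumber($value)
{
    if (!is_int($value) && !is_float($value)) {
        throw new InvalidArgumentException('Expected a number, got ' . gettype($value));
    }
}

// Hypothetical helper: accept only real strings.
function expectString($value)
{
    if (!is_string($value)) {
        throw new InvalidArgumentException('Expected a string, got ' . gettype($value));
    }
}

function foo($number, $string)
{
    expectNumber($number);
    expectString($string);
    // Illustrative body: repeat the string $number times.
    return str_repeat($string, (int) $number);
}
```

Calling foo(2, 'ab') goes through, while foo('2', 'ab') throws, which is the strict behaviour being discussed.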
Cheers
--
Jordi Boggiano
@seldaek - http://nelm.io/jordi
Hey Jordi,
And I, on the other hand, disagree.
Looking at it from an OSS maintainer perspective, introducing strict hints in code would be a huge BC break as I don't know how people use my code, nor if they validate/coerce their user input early or not. If I suddenly declare something as int and someone used to pass '5', they get an error. That would make adoption quite hard in OSS IMO if you don't want to bother people.
With weak typing on the other hand, I get to remove tons of @param annotations which would be amazing, and I get additional safeties that new mis-uses (like passing an object in an arg expecting a string, which at the moment could not be type hinted) will be caught early in the future.
Existing mis-uses probably don't exist much since they'd likely fail further down in the code execution already, but in case they don't they'd be caught too and that's a good thing since it makes people look at their buggy code.
As for valid uses of the code, like the '5' above, it keeps working just fine because it is fine. It's not my call as a library author to decide whether the users of the lib should cast user inputs to ints or validate them or do nothing at all.
This is an excellent point which had never occurred to me before! Yes, weakly-typed hints like these are far less likely to break existing code. I should probably mention that in the RFC.
Thanks!
Andrea Faulds
http://ajf.me/
Looking at it from an OSS maintainer perspective, introducing strict hints
in code would be a huge BC break as I don't know how people use my code,
nor if they validate/coerce their user input early or not. If I suddenly
declare something as int and someone used to pass '5', they get an error.
That would make adoption quite hard in OSS IMO if you don't want to bother
people.
Regardless if this RFC gets through or not, changing the signature of a
method IS a BC break in any case, so you shouldn't just move your int
declaration from the docblocks into the method signature.
That'd be a simple lib to write, it barely adds more code to your
functions, and you get strict typing. But I hardly see anyone do this, and
I would argue it's because it sounds appealing in theory but it's not worth
it in practice.
I'm actually already using https://github.com/beberlei/assert for that, and
I'd like to get rid of it for simpler types.
Marco Pivetta
Regardless if this RFC gets through or not, changing the signature of a
method IS a BC break in any case, so you shouldn't just move your int
declaration from the docblocks into the method signature.
On public methods yes although a lot of internal public methods are
probably safe-ish, or would only affect people extending/overriding.
Private methods though could still benefit from this perfectly safely,
but if the public methods route params to private methods with strict
hints you still get issues.
I'm actually already using https://github.com/beberlei/assert for that,
and I'd like to get rid of it for simpler types.
Interesting, I didn't know about this, and it's a cool way to get more
specialized assertions than plain type hints allow. IMO with the RFC as
it stands you could still get rid of it for simple types though because
it's not your job to validate user input. You need an int, you ask for an int
and get an int. If users like convenience they can skip validation, if
they like strictness they can do their own layer of Assertion::foo calls
in their controllers.
Cheers
--
Jordi Boggiano
@seldaek - http://nelm.io/jordi
Hello Jordi,
Looking at it from an OSS maintainer perspective, introducing strict
hints in code would be a huge BC break as I don't know how people use my
code, nor if they validate/coerce their user input early or not. If I
suddenly declare something as int and someone used to pass '5', they get
an error. That would make adoption quite hard in OSS IMO if you don't
want to bother people.
I'd argue that if your concern is that big, you just wouldn't do it ... it's up
to the author to use it as they see fit. You could leave the hints off all
your public-facing methods and use them only internally
(aha, just saw someone mentioned this in another email too; weird).
With weak typing on the other hand, I get to remove tons of @param
annotations which would be amazing
[...]
I hope you don't because @param should also be describing the
purpose/usage of the parameter ;-)
It's not my call as a library author to decide
whether the users of the lib should cast user inputs to ints or validate
them or do nothing at all.
That's really interesting, because I as library author actually do
provide "how to use my library". I mean:
- I create it with my vision
- I document it
- I create the examples how to use it
- I decide what's private/protected/public, i.e. how extensible things are
Well, there's a lot of "I" but that would also apply to teams and their
decision how to use it.
My want-to-be-on-the-safe-side-code currently is either riddled with
explicit casts or simply nothing; I try to keep my methods small and
that overhead could outweigh code logic sometimes which I'd consider
counterproductive.
Besides, IMO if strict typing was so desirable we would see more of the
String/Int classes out there, or we would see people do things like:
Well, performance; manual boxing/unboxing is IMHO not really feasible to
do with the interpreter currently (ok; gut feeling here, no benchmark).
So, yes, I would use it if the overall overhead incurred did not, IMHO,
make it infeasible. (sorry for repeating)
function foo($number, $string) { expectNumber($number); expectString($string); }
That'd be a simple lib to write, it barely adds more code to your
functions, and you get strict typing. But I hardly see anyone do this,
and I would argue it's because it sounds appealing in theory but it's
not worth it in practice.
I'd love to do that but simply also don't want to clutter my code with
such an approach.
The current situation to me is a bit like: it bothers me, but not to the
point I'd clutter my code with an approach like you've shown (and was
tempted multiple times in the past to do ...)
Although it seems like I wrote so many "counter" arguments, please don't
misinterpret. For the love of history I merely told my view and I'm
actually glad you shared yours; with this important topic for PHP,
however it ends, there can't be too many views on that matter.
thank you,
- Markus
A forced-cast syntax makes sense not only for scalar types, but also for
instances of classes.
It could be implemented as syntactic sugar:
<?php
class User
{
    public function __construct(int $id)
    {
        ......
    }
}

function printPerson((User) $object)
{
    var_dump($object);
    return ($object instanceof User);
}

printPerson(new User(101)); // TRUE
printPerson(101); // TRUE
// scalar int = 101, sent to constructor method class User, the resulting
// instance User passed to the function printPerson
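The proposed `(User) $object` syntax does not exist in PHP. As a hedged sketch, the same effect can be approximated today in userland; the constructor body and the `$id` property below are assumptions added only so the example runs.

```php
<?php
class User
{
    public $id;

    public function __construct($id)
    {
        $this->id = $id; // assumed body, elided in the message above
    }
}

// Userland stand-in for the proposed "(User) $object" parameter syntax.
function printPerson($object)
{
    if (!($object instanceof User)) {
        // What the sugar would do implicitly: pass the value to the constructor.
        $object = new User($object);
    }
    return $object;
}

$a = printPerson(new User(101));
$b = printPerson(101);
```

Both calls end up with a User instance whose id is 101.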
thanks for the rfc! Currently we allow:
function test(Array $o){}
test([]);
function test2(array $o){}
test2([]);
So I propose to allow also uppercase type names:
public function __construct(String $name, Int $age, Float $cuteness, Bool $evil) {
Regards
Thomas
Andrea Faulds wrote on 31.12.2014 21:27:
Good evening,
Parameter type hints for PHP’s scalar types are a long-requested feature for
PHP. Today I am proposing an RFC which is a new attempt to add them to the
language. It is my hope that we can finally get this done for PHP 7.
I’d like to thank Dmitry, who emailed me and gave me some feedback on the
draft RFC and some improvements to the patch. He encouraged me to put this to
internals sooner rather than later, as it’s a feature many people are hoping
will be in PHP 7.
The new RFC can be found here: https://wiki.php.net/rfc/scalar_type_hints
As well as the RFC, there is a working Zend Engine III patch with tests, and an
incomplete specification patch.
Please read the RFC (and specification patch, if you wish) and tell me your
thoughts.
Thanks!
Andrea Faulds
http://ajf.me/
Hi Thomas,
thanks for the rfc! Currently we allow:
function test(Array $o){}
test([]);
function test2(array $o){}
test2([]);
So I propose to allow also uppercase type names:
public function __construct(String $name, Int $age, Float $cuteness, Bool $evil) {
The RFC doesn’t mention it, but this is allowed by the patch, since type names are case-insensitive.
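A quick way to check the case-insensitivity, assuming PHP 7 or later where scalar declarations are available (the `describe` function is made up for this sketch):

```php
<?php
// Capitalised scalar type names, as in the example above.
function describe(String $name, Int $age, Float $cuteness, Bool $evil)
{
    // gettype() reports the runtime type of each argument.
    return gettype($name) . ',' . gettype($age) . ',' . gettype($cuteness) . ',' . gettype($evil);
}

$types = describe('Rex', 3, 9.5, false);
var_dump($types);
```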
Thanks.
Andrea Faulds
http://ajf.me/
Please read the RFC (and specification patch, if you wish) and tell me your thoughts.
My thoughts as a long time (non-voting) PHP user:
I'd much prefer if they were strict types. I have a bias toward this
because I prefer to be as strict as possible. It's not hard to type
my_func((int) $_GET['foo']) if I want easy conversion. This explicit
cast serves as a reminder that I'm being sloppy. I can quickly scan
and see if there are parts of my code that are more prone to bugs.
But I'm not opposed to the RFC. I think it's way better than
nothing, and I understand the arguments in its favor. My one
complaint: I don't like that it emits a notice. I treat notices as
broken code, and if it's proper to say (int) "7 things", then it ought
to be proper to send that into a function. With notices being a
possibility, I'll need to manually add (int) in front of everything
... at which point, we might as well have strict types. (Off-topic and
different RFC, but I think having return types that auto-cast is
weird, which is another reason I'd prefer strict types all around.)
Finally, I'm not sure that implementing both (strict / type juggling)
with different syntax is a good idea. I think I'd prefer one or the
other. I don't really want to keep track as a user of many composer
libs (etc) which authors decided I need to use strict types. Because I
wouldn't want to have two different styles of code depending on the
library I'm using, I'd end up again going back to assuming everything
was a strict type.
--
Matthew Leverton
Hey Matthew,
I'd much prefer if they were strict types. I have a bias toward this
because I prefer to be as strict as possible. It's not hard to type
my_func((int) $_GET['foo']) if I want easy conversion. This explicit
cast serves as a reminder that I'm being sloppy. I can quickly scan
and see if there are parts of my code that are more prone to bugs.
One problem with explicit casts like (int) is they never throw errors, unlike implicit casts. takes_int([]) is an error, (int)[] isn’t. There was a safe casting functions RFC, but that was rejected. Still, yes, you have a point in that the sloppiness is more obvious.
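A minimal sketch of that difference, assuming PHP 7 or later, where a failed parameter check surfaces as a TypeError (under the RFC draft it would be an error rather than an exception):

```php
<?php
function takes_int(int $n)
{
    return $n;
}

// Explicit cast: never complains, an empty array simply becomes 0.
$cast = (int) [];

// Declared parameter: the same value is refused.
try {
    takes_int([]);
    $threw = false;
} catch (TypeError $e) {
    $threw = true;
}

var_dump($cast, $threw);
```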
But I'm not opposed to the RFC. I think it's way better than
nothing, and I understand the arguments in its favor. My one
complaint: I don't like that it emits a notice. I treat notices as
broken code, and if it's proper to say (int) "7 things", then it ought
to be proper to send that into a function. With notices being a
possibility, I'll need to manually add (int) in front of everything
... at which point, we might as well have strict types. (Off-topic and
different RFC, but I think having return types that auto-cast is
weird, which is another reason I'd prefer strict types all around.)
I’m not a fan of this behaviour either, but it is the behaviour we already have. Another RFC could fix it. I should probably write one.
Finally, I'm not sure that implementing both (strict / type juggling)
with different syntax is a good idea. I think I'd prefer one or the
other. I don't really want to keep track as a user of many composer
libs (etc) which authors decided I need to use strict types. Because I
wouldn't want to have two different styles of code depending on the
library I'm using, I'd end up again going back to assuming everything
was a strict type.
This is how I feel.
Thanks.
Andrea Faulds
http://ajf.me/
hi Andrea,
Good evening,
Parameter type hints for PHP’s scalar types are a long-requested feature for PHP. Today I am proposing an RFC which is a new attempt to add them to the language. It is my hope that we can finally get this done for PHP 7.
I’d like to thank Dmitry, who emailed me and gave me some feedback on the draft RFC and some improvements to the patch. He encouraged me to put this to internals sooner rather than later, as it’s a feature many people are hoping will be in PHP 7.
The new RFC can be found here: https://wiki.php.net/rfc/scalar_type_hints
As well as the RFC, there is a working Zend Engine III patch with tests, and an incomplete specification patch.
Please read the RFC (and specification patch, if you wish) and tell me your thoughts.
Thanks, great work and persistent effort!
As I am also slightly in favor of a strict way, this RFC is a good compromise.
Some comments:
- Non-numeric strings not accepted. Numeric strings with trailing
characters are accepted, but produce a notice.
I would rather not allow fancy conversions here. Any trailing
non-whitespace characters should not be allowed. I know it is not what
PHP does now in some cases but this is really a fuzzy area and never
really matched any actual needs or usages.
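For readers unsure what the fuzzy area looks like, this small sketch contrasts an explicit cast with is_numeric() on a string that has trailing characters:

```php
<?php
$input = "7 things";

// An explicit cast keeps the leading digits and drops the rest.
$cast = (int) $input;

// is_numeric() does not consider the same string numeric.
$isNumeric = is_numeric($input);

var_dump($cast, $isNumeric);
```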
I am also not a fan of errors; exceptions, at least for methods, make
much more sense. I know it is relatively easy to handle errors as
exceptions, but still, let's do it right now.
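The "relatively easy" route mentioned above is usually the well-known set_error_handler() plus ErrorException pattern. A sketch follows; it covers warnings, notices and recoverable errors, but not fatal errors, which never reach a user error handler.

```php
<?php
error_reporting(E_ALL);

// Turn every reported error into an ErrorException.
set_error_handler(function ($severity, $message, $file, $line) {
    if (!(error_reporting() & $severity)) {
        return false; // respect the current error_reporting level
    }
    throw new ErrorException($message, 0, $severity, $file, $line);
});

try {
    trigger_error('bad argument', E_USER_WARNING);
    $caught = null;
} catch (ErrorException $e) {
    $caught = $e->getMessage();
}

restore_error_handler();
var_dump($caught);
```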
Cheers,
Pierre
@pierrejoye | http://www.libgd.org
Hi!
I am also not a fan of errors; exceptions, at least for methods, make
much more sense. I know it is relatively easy to handle errors as
exceptions, but still, let's do it right now.
Then we need to convert all parameter mismatch errors (or at least all
fatal ones) to exceptions. Which I don't think makes a lot of sense to
put in the same RFC but in any case it should work the same way over all
the engine. Otherwise handling it would be a nightmare.
Stas Malyshev
smalyshev@gmail.com
Hi!
I am also not a fan of errors; exceptions, at least for methods, make
much more sense. I know it is relatively easy to handle errors as
exceptions, but still, let's do it right now.
Then we need to convert all parameter mismatch errors (or at least all
fatal ones) to exceptions. Which I don't think makes a lot of sense to
put in the same RFC but in any case it should work the same way over all
the engine. Otherwise handling it would be a nightmare.
I am not sure about that. Parameters handling is one specific case,
userland parameter handling even more. It could be a good move to do
that as a 1st step, with this RFC.
--
Pierre
@pierrejoye | http://www.libgd.org
Hi!
I am not sure about that. Parameters handling is one specific case,
userland parameter handling even more. It could be a good move to do
Making internal and userland parameters work differently and having
scalar type errors behave differently from object type errors by having
one throw exceptions and another errors looks like a mistake to me. It
only makes handling errors harder as you'd have to handle two types of
situations instead of one.
that as a 1st step, with this RFC.
I don't think "1st step" is a good approach here. The language should
provide consistent expectations, including about what happens when you
pass certain data to it, including error conditions. If we have
different types of error conditions between internal and userland
functions, it would not be a good thing.
We should make all parameter handling work the same way - so if you pass
a parameter and it does not match the expectations, you know what you're
getting. If it works one way to internals, another for user functions,
it would only make it harder to handle.
Stas Malyshev
smalyshev@gmail.com
Hi!
I am not sure about that. Parameters handling is one specific case,
userland parameter handling even more. It could be a good move to do
Making internal and userland parameters work differently and having
scalar type errors behave differently from object type errors by having
one throw exceptions and another errors looks like a mistake to me. It
only makes handling errors harder as you'd have to handle two types of
situations instead of one.
I was not clear, sorry.
Parameter handling in general, as described in this RFC, is a special
case. This addition is somehow different from the other argument handling
we have, whether it is used for internal functions or userland. I just
see exceptions as even more useful in userland code.
There is no change per se to existing functions. Or to say it in a
better way, I am not keen to begin to change every 2nd internal
function to apply this new RFC, it could cause some bad headaches.
However, new functions and the like, explicitly relying on this
RFC, may be good candidates for different handling of bad
arguments.
that as a 1st step, with this RFC.
I don't think "1st step" is a good approach here. The language should
provide consistent expectations, including about what happens when you
pass certain data to it, including error conditions. If we have
different types of error conditions between internal and userland
functions, it would not be a good thing.
Agreed, but still. New userland code could have huge benefits if we
allow that. But changing every internal function may have a bad
impact.
We should make all parameter handling work the same way - so if you pass
a parameter and it does not match the expectations, you know what you're
getting. If it works one way to internals, another for user functions,
it would only make it harder to handle.
Right again. Still not sure about the perfect solution without
impacting too much existing userland code.
Cheers,
Pierre
@pierrejoye | http://www.libgd.org
I am also not a fan of errors; exceptions, at least for methods, make
much more sense. I know it is relatively easy to handle errors as
exceptions, but still, let's do it right now.
I second this; this could be an excellent opportunity to move in the
direction favored by many developers here (judging by the number of
times this and related "exception" topics come up on the list), but so
far we have lacked a "how to start"/"how to get it right" approach, and
this could be the gradual start of the shift.
thanks for bringing this up,
- Markus
Hey Pierre,
Thanks, great work and persistent effort!
Thank you.
As I am also slightly in favor of a strict way, this RFC is a good compromise.
Some comments:
- Non-numeric strings not accepted. Numeric strings with trailing
characters are accepted, but produce a notice.
I would rather not allow fancy conversions here. Any trailing
non-whitespace characters should not be allowed. I know it is not what
PHP does now in some cases but this is really a fuzzy area and never
really matched any actual needs or usages.
Yeah, I don’t like this behaviour much. I want to avoid inconsistency with the behaviour of extension functions (i.e. zend_parse_parameters) where possible, though. Since this has come up so much, I should probably make an RFC to change this aspect of ZPP’s behaviour.
I am also not a fan of errors; exceptions, at least for methods, make
much more sense. I know it is relatively easy to handle errors as
exceptions, but still, let's do it right now.
That’d be inconsistent with our other type hints. To get this changed, I think we’ll just have to wait for Nikita’s Exceptions in the Engine for PHP 7 RFC.
Thanks.
Andrea Faulds
http://ajf.me/
- Non-numeric strings not accepted. Numeric strings with trailing
characters are accepted, but produce a notice.
I would rather not allow fancy conversions here. Any trailing
non-whitespace characters should not be allowed. I know it is not what
PHP does now in some cases but this is really a fuzzy area and never
really matched any actual needs or usages.
Yeah, I don’t like this behaviour much. I want to avoid inconsistency with the behaviour of extension functions (i.e. zend_parse_parameters) where possible, though. Since this has come up so much, I should probably make an RFC to change this aspect of ZPP’s behaviour.
That RFC should probably be voted on before this RFC proceeds into voting.
After skimming through the RFC I'm unsure what the following code would produce:
function test(int $a, int $b) {}
test("10.4", 10.6);
If a warning/notice is raised, fine. If it will just result in $a ===
(int)10 && $b ===(int)10 :
I agree that consistency is a sound reason for this behavior. However
type annotations (or whatever they will be called) should not only
ensure that a parameter has a certain type but that no data was lost
during the conversion. It is reasonable (assuming the target type can
handle the value) and fine to convert an integer into a float, a
float/integer to a string or any string that would satisfy is_numeric
into a float (or if no data was lost an integer) and hide it from the
developer (after all, that is what PHP is about). If a conversion might
result in data or precision loss the developer should be notified with
a hard failure. (E_RECOV, E_WARNING, I can still choose to ignore them
using error handlers or error_reporting settings).
Allowing data (I would also consider (string)'12a' => int(12) data
loss) or precision loss while converting a typed parameter would
reduce the usability of the whole addition.
If I'm in a situation where data loss is acceptable I could just leave
the annotation off and cast as I always would have, or not cast and
trust PHP to do whatever is required.
However, caring about data loss on these specific occasions (including
the string -> int example above) would forbid me to use type
annotations. As I would have no way of knowing and bailing out if the
original value was altered before it was accessible by my code.
If type annotations wouldn't go beyond the safety that php currently
provides by converting in specific contexts and casting what is the
benefit in adding them?
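To illustrate the "no data lost" rule argued for above, here is a userland sketch. `to_int_lossless` is a hypothetical helper, not part of the RFC or of PHP, and it assumes a 64-bit build. It is also stricter than the description above in one respect: a string such as "10.0" is refused, because FILTER_VALIDATE_INT only accepts integer notation.

```php
<?php
// Hypothetical helper: convert to int only when no information is lost.
function to_int_lossless($value)
{
    if (is_int($value)) {
        return $value;
    }
    // Floats qualify only when integral and small enough to be exact.
    if (is_float($value) && floor($value) === $value && abs($value) <= 2 ** 53) {
        return (int) $value;
    }
    // Strings qualify only when they spell a decimal integer.
    if (is_string($value)) {
        $int = filter_var($value, FILTER_VALIDATE_INT);
        if ($int !== false) {
            return $int;
        }
    }
    throw new InvalidArgumentException('Lossy conversion to int refused');
}
```

With this helper, both arguments of test("10.4", 10.6) would be refused, as would '12a', while "10" and 10.0 would go through as int 10.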
However, while I see issues with the RFC (that might be related to a
misunderstanding on my side) I like the general direction.
I am also not a fan of errors; exceptions, at least for methods, make
much more sense. I know it is relatively easy to handle errors as
exceptions, but still, let's do it right now.
That’d be inconsistent with our other type hints. To get this changed, I think we’ll just have to wait for Nikita’s Exceptions in the Engine for PHP 7 RFC.
Here is what I run on my ubuntu 14.04 for compiling & testing:
apt-get install build-essential re2c bison
git clone -b master https://github.com/php/php-src.git
cd php-src
curl -L https://github.com/php/php-src/pull/972.patch | git am
./buildconf
./configure
make
sapi/cli/php /tmp/test.php
test.php is my testing script.
Regards
Thomas
Sebastian B.-Hagensen wrote on 03.01.2015 20:46:
- Non-numeric strings not accepted. Numeric strings with trailing
characters are accepted, but produce a notice.
I would rather not allow fancy conversions here. Any trailing
non-whitespace characters should not be allowed. I know it is not what
PHP does now in some cases but this is really a fuzzy area and never
really matched any actual needs or usages.
Yeah, I don’t like this behaviour much. I want to avoid inconsistency with
the behaviour of extension functions (i.e. zend_parse_parameters) where
possible, though. Since this has come up so much, I should probably make an
RFC to change this aspect of ZPP’s behaviour.That RFC should probably be voted on before this RFC proceeds into voting.
After skimming through the RFC I'm unsure what the following code would
produce:
function test(int $a, int $b) {}
test("10.4", 10.6);
If a warning/notice is raised, fine. If it will just result in $a ===
(int)10 && $b ===(int)10 :
I agree that consistency is a sound reason for this behavior. However
type annotations (or whatever they will be called) should not only
ensure that a parameter has a certain type but that no data was lost
during the conversion. It is reasonable (assuming the target type can
handle the value) and fine to convert an integer into a float, a
float/integer to a string or any string that would satisfy is_numeric
into a float (or if no data was lost an integer) and hide it from the
developer (after all, that is what PHP is about). If a conversion might
result in data or precision loss the developer should be notified with
a hard failure. (E_RECOV, E_WARNING, I can still choose to ignore them
using error handlers or error_reporting settings).
Allowing data (I would also consider (string)'12a' => int(12) data
loss) or precision loss while converting a typed parameter would
reduce the usability of the whole addition.
If I'm in a situation where data loss is acceptable I could just leave
the annotation off and cast as I would've always done, or not cast and
trust PHP to do whatever is required.
However, caring about data loss in these specific occasions (including
the string -> int example above) would forbid me from using type
annotations, as I would have no way of knowing and bailing out if the
original value was altered before it was accessible by my code.
If type annotations wouldn't go beyond the safety that PHP currently
provides by converting in specific contexts and casting, what is the
benefit in adding them?
However, while I see issues with the RFC (that might be related to a
misunderstanding on my side) I like the general direction.
I am also not a fan of errors; exceptions, at least for methods, make
much more sense. I know it is relatively easy to handle errors as
exceptions, but still, let's do it right now.
That’d be inconsistent with our other type hints. To get this changed, I
think we’ll just have to wait for Nikita’s Exceptions in the Engine for
PHP 7 RFC.
Hi Thomas,
Here is what I run on my ubuntu 14.04 for compiling & testing:
apt-get install build-essential re2c bison
git clone -b master https://github.com/php/php-src.git
cd php-src
curl https://github.com/php/php-src/pull/972.patch | git am
./buildconf
./configure
make
sapi/cli/php /tmp/test.php
test.php is my testing script.
I wouldn’t advise using the patch against master, since the branch may be slightly out-of-date compared to master if I haven’t recently merged it in or rebased it. Instead, I suggest you check out the branch itself.
Thanks.
Andrea Faulds
http://ajf.me/
Hi Sebastian,
Yeah, I don’t like this behaviour much. I want to avoid inconsistency with the behaviour of extension functions (i.e. zend_parse_parameters) where possible, though. Since this has come up so much, I should probably make an RFC to change this aspect of ZPP’s behaviour.
That RFC should probably be voted on before this RFC proceeds into voting.
That would be the idea.
After skimming through the RFC I'm unsure what the following code would produce:
function test(int $a, int $b) {}
test("10.4", 10.6);
If a warning/notice is raised, fine. If it will just result in $a ===
(int)10 && $b ===(int)10 :
The first argument would be converted to 10 and a notice (“Non well formed numeric string”). The second would also be converted to 10, but silently.
I agree with your sentiments about data loss, but I am reluctant to deviate much from the behaviour of internal functions to avoid the inconsistency that plagued the previous RFC.
If type annotations wouldn't go beyond the safety that php currently
provides by converting in specific contexts and casting what is the
benefit in adding them?
They’re still much safer than what we currently have. An unhinted parameter will accept anything. A scalar hinted parameter won’t accept non-scalars, and will only accept certain scalars.
Plus, they also avoid the need to use things like docblocks for many functions where merely adding types would make them self-explanatory.
Thanks.
--
Andrea Faulds
http://ajf.me/
Hi,
Hi Sebastian,
On 3 Jan 2015, at 19:46, Sebastian B.-Hagensen sbj.ml.read@gmail.com
wrote:
Yeah, I don’t like this behaviour much. I want to avoid inconsistency
with the behaviour of extension functions (i.e. zend_parse_parameters)
where possible, though. Since this has come up so much, I should probably
make an RFC to change this aspect of ZPP’s behaviour.
That RFC should probably be voted on before this RFC proceeds into
voting.
That would be the idea.
After skimming through the RFC I'm unsure what the following code would
produce:
function test(int $a, int $b) {}
test("10.4", 10.6);
If a warning/notice is raised, fine. If it will just result in $a ===
(int)10 && $b ===(int)10 :
The first argument would be converted to 10 and a notice (“Non well
formed numeric string”). The second would also be converted to 10, but
silently.
Hm. It sounds bad. There is a data loss, a notice must be raised. This is
exactly the kind of magic conversion that should not happen for arguments.
I agree with your sentiments about data loss, but I am reluctant to
deviate much from the behaviour of internal functions to avoid the
inconsistency that plagued the previous RFC.
Right, but this is what I would expect. Am I the only one?
On 04.01.15 04:43, Pierre Joye wrote:
I agree with your sentiments about data loss, but I am reluctant to
deviate much from the behaviour of internal functions to avoid the
inconsistency that plagued the previous RFC.
Right, but this is what I would expect. Am I the only one?
Definitely not; since strict types would elevate such problems that's my
preference anyway.
- Markus
Hi everyone,
Just a few small updates.
I’ve made a small change to this RFC. Instead of the strict mode syntax being declare(strict_typehints=TRUE), it’s now declare(strict_types=1) instead. This makes it a bit quicker to type - important given you’d need to type it a lot - without sacrificing much readability. It also avoids using the words “type hint”, which I understand are contentious to some people.
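For readers unfamiliar with the directive, here is a minimal sketch of how declare(strict_types=1) is written and what it does. It reflects the behaviour of released PHP 7, where a mismatch throws a TypeError; that is an assumption relative to this thread, where failures were still being discussed as errors rather than exceptions. The function add() is purely illustrative.

```php
<?php
declare(strict_types=1); // must be the first statement in the file

// Illustrative function; the name is not taken from the RFC.
function add(int $a, int $b)
{
    return $a + $b;
}

$sum = add(1, 2); // both arguments are already ints, so this is accepted

$caught = null;
try {
    add("1", 2); // a numeric string is rejected in strict mode
} catch (TypeError $e) {
    $caught = get_class($e);
}

var_dump($sum, $caught);
```

Without the declare line, the same call add("1", 2) would be accepted and the string converted to an int.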
The patch now includes a more extensive set of tests: https://github.com/TazeTSchnitzel/php-src/tree/scalar_type_hints_2_strict_mode/Zend/tests/typehints/
The pull request is also now green on Travis (aside from two failures on the debug build - but they also fail in master).
Levi’s Return Types RFC has now passed, so now the RFC needs to cover that. The RFC currently doesn’t discuss return types, save for a paragraph in the Future Scope section. When the return types patch is merged, I’ll update the scalars patch, and then update the RFC. A point of contention here is whether or not return types should always be strict, or whether they should also obey the strict/weak modes. I’m not entirely sure on that one, that specific item may end up going to a vote. That said, I do lean towards always strict, given you can usually ensure your return type’s correct.
Thanks.
--
Andrea Faulds
http://ajf.me/
PS: It’s rumoured that a certain other female internals developer may have endorsed this RFC at a certain recent PHP conference. That’s good to hear!
Hi!
going to a vote. That said, I do lean towards always strict, given
you can usually ensure your return type’s correct.
Wait, so we would have two modes, strict and non-strict, but also in
non-strict mode, return types still will be strict? Yay, consistency!
--
Stas Malyshev
smalyshev@gmail.com
Hey Stas,
Hi!
going to a vote. That said, I do lean towards always strict, given
you can usually ensure your return type’s correct.
Wait, so we would have two modes, strict and non-strict, but also in
non-strict mode, return types still will be strict? Yay, consistency!
Yes, it would have that inconsistency, so there’s also the other possibility of being weak for return types in weak mode.
Thing is, I haven’t seen (so far) anyone who seems to think return types should be converted. We don’t do this for internal functions (to be fair, there’s no need, C is statically-typed). Them being strict would follow the robustness principle, too: “be conservative in what you send, be liberal in what you accept”.
On the other hand, it may not be terribly fitting with “PHP’s weakly-typed nature”.
It’s hard to say.
--
Andrea Faulds
http://ajf.me/
Hello,
personally I still don't like this RFC in its current form and
"shorter" declare won't change it. I was thinking a lot about the
typehints in PHP for last few days and I think having only one way
would be the best - and it's somewhere between the current weak and
strict typing. My main "issue" is that the current weak typing is too
loose and the strict typing is too strict.
The problem with the current strict typing is that you cannot pass
"int" to a "float" parameter, even though there can be a totally
lossless conversion and it works in other strongly typed languages.
And being able to pass a float(1.5) to int and lose the 0.5 value
doesn't make sense as well, because data will get lost. Neither of
those feels somehow "predictable" and "natural".
Also, after a little bit of thinking, if someone needs to do a type
conversion while calling a method, writing foo((int) $bar) isn't that
hard.
So, I think it would be best to choose just one of these two
approaches and either loosen it a little or make it more strict (so
data loss doesn't happen). But I guess this approach would be
inconsistent with how the built-in PHP functions work?
PS: Ideally, the data loss rules should be made for types and not
values (like the old scalar type hints RFC had), so you don't get
unpredictable results. The only ones I can think of right now are
basically int -> bool, int -> float, object (w/ __toString) -> string,
int -> string, float -> string?
Regards
Pavel Kouril
Hi Pavel,
personally I still don't like this RFC in its current form and
"shorter" declare won't change it.
I didn’t expect that making it shorter would really change anyone’s opinions, except perhaps those who don’t like the term “type hint”.
I was thinking a lot about the
typehints in PHP for last few days and I think having only one way
would be the best - and it's somewhere between the current weak and
strict typing. My main "issue" is that the current weak typing is too
loose and the strict typing is too strict.
The problem with the current strict typing is that you cannot pass
"int" to a "float" parameter, even though there can be a totally
lossless conversion and it works in other strongly typed languages.
It can sometimes be a lossless conversion. Only sometimes.
For float to int conversion:
- Floats have the special values INF, NAN and -NAN, which cannot be preserved
- Floats have negative zero, which also cannot be preserved
- Fractional components cannot be preserved
- Floats sacrifice precision to allow a wider range of values. They work with significant figures (scientific notation), unlike integers which always offer full precision. So a particular float value isn’t necessarily equivalent to a particular integer value, you have to invent precision to do the conversion. 2e10 is dealt with as if it’s 2 with 10 zeroes after it, but it’s just a number beginning with 2 that has a magnitude of roughly 10^10. If you convert it to the integer value 20 000 000 000, you’ve just invented values for those trailing digits - those digits weren’t necessarily zero, we just don’t know what those digits are. Someone who’s an expert on floating-point might need to correct me here, but I think this is correct to some extent. What I’m saying is that float->integer conversion is inherently imprecise.
For int to float conversion:
- Values beyond 2^53 or below -2^53 cannot be represented as floats without a loss of precision
Some strongly-typed languages allow these conversions implicitly, but I’m not sure that’s a good thing or something we should want to copy. Loss of precision isn’t good. If you ask for strict typing, you probably want to avoid it, and should get strict typing.
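The two directions of loss listed above can be seen with ordinary casts. This is a small sketch, assuming a 64-bit PHP build where int is 64 bits wide and float is an IEEE 754 double; the variable names are illustrative only.

```php
<?php
// int -> float: 2^53 + 1 has no exact double representation,
// so it comes back as a different integer after a round trip.
$big = 2 ** 53 + 1;                  // int 9007199254740993
$roundTripped = (int) (float) $big;  // int 9007199254740992

// float -> int: the fractional component is simply dropped.
$truncated = (int) 1.5;              // int 1

var_dump($big === $roundTripped);    // bool(false)
var_dump($truncated);                // int(1)
```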
And being able to pass a float(1.5) to int and lose the 0.5 value
doesn't make sense as well, because data will get lost. Neither of
those feels somehow "predictable" and "natural".
Sure, but it is our existing behaviour.
Also, after a little bit of thinking, if someone needs to do a type
conversion while calling a method, writing foo((int) $bar) isn't that
hard.
This isn’t a good idea. Explicit casts do not care for what value you give them, they will convert whether or not the conversion makes sense.
Unfortunately we don’t have safe casting functions because they were rejected. Alas.
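As a short illustration of why an explicit cast is not a safe substitute, the following sketch shows (int) accepting inputs that make no sense as integers, without any notice or error. The array name $casts is illustrative only.

```php
<?php
// Explicit casts convert whatever they are given, silently.
$casts = [
    'abc'  => (int) 'abc',   // 0: no digits at all
    '12a'  => (int) '12a',   // 12: trailing garbage ignored
    ''     => (int) '',      // 0: empty string
    '10.4' => (int) '10.4',  // 10: fraction discarded
];
$fromFloat = (int) 10.6;     // 10: fraction discarded

var_dump($casts, $fromFloat);
```

So foo((int) $bar) hides exactly the kind of bad input that a type declaration is meant to catch.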
So, I think it would be best to choose just one of these two
approaches and either loosen it a little or make it more strict (so
data loss doesn't happen). But I guess this approach would be
inconsistent with how the built-in PHP functions work?
While it never went to a vote, the Scalar Type Hinting with Casts RFC, which proposed stricter weak casts, was not well-received. The Safe Casting Functions RFC was rejected when it went to a vote.
PS: Ideally, the data loss rules should be made for types and not
values (like the old scalar type hints RFC had), so you don't get
unpredictable results.
The Scalar Type Hinting with Casts RFC didn’t do that, it was also based on values.
In a dynamically-typed language like PHP, I don’t see why it needs to be type-based rather than value-based.
Thanks.
--
Andrea Faulds
http://ajf.me/
Hi Pavel,
Hi, thanks for explaining some things.
It can sometimes be a lossless conversion. Only sometimes.
For float to int conversion:
- Floats have the special values INF, NAN and -NAN, which cannot be preserved
- Floats have negative zero, which also cannot be preserved
- Fractional components cannot be preserved
- Floats sacrifice precision to allow a wider range of values. They work with significant figures (scientific notation), unlike integers which always offer full precision. So a particular float value isn’t necessarily equivalent to a particular integer value, you have to invent precision to do the conversion. 2e10 is dealt with as if it’s 2 with 10 zeroes after it, but it’s just a number beginning with 2 that has a magnitude of roughly 10^10. If you convert it to the integer value 20 000 000 000, you’ve just invented values for those trailing digits - those digits weren’t necessarily zero, we just don’t know what those digits are. Someone who’s an expert on floating-point might need to correct me here, but I think this is correct to some extent. What I’m saying is that float->integer conversion is inherently imprecise.
For int to float conversion:
- Values beyond 2^53 or below -2^53 cannot be represented as floats without a loss of precision
Some strongly-typed languages allow these conversions implicitly, but I’m not sure that’s a good thing or something we should want to copy. Loss of precision isn’t good. If you ask for strict typing, you probably want to avoid it, and should get strict typing.
And being able to pass a float(1.5) to int and lose the 0.5 value
doesn't make sense as well, because data will get lost. Neither of
those feels somehow "predictable" and "natural".
Sure, but it is our existing behaviour.
Yeah, as I said, implicit float to int is IMHO bad and I personally
don't like it much, because data loss sucks. But being consistent with
existing behavior is probably the right way to do stuff.
About the problem with int to float and loss of precision beyond 2^53:
I didn't realize that, was thinking just about 32bit integers when I
wrote that. But now I wonder how other languages do it, when they are
implicitly converting 64bit integers to double precision floating
point numbers.
Unfortunately we don’t have safe casting functions because they were rejected. Alas.
So, I think it would be best to choose just one of these two
approaches and either loosen it a little or make it more strict (so
data loss doesn't happen). But I guess this approach would be
inconsistent with how the built-in PHP functions work?
While it never went to a vote, the Scalar Type Hinting with Casts RFC, which proposed stricter weak casts, was not well-received. The Safe Casting Functions RFC was rejected when it went to a vote.
Oh, I didn't know that stricter weak casts were not well-received,
because I didn't read internals mailing list back then. But if that's
the case, I would gladly see the weak variant of this RFC accepted.
Have you thought about splitting this RFC into two? One for adding the
weak version and another one for adding the declare strict statement?
The Scalar Type Hinting with Casts RFC didn’t do that, it was also based on values.
In a dynamically-typed language like PHP, I don’t see why it needs to be type-based rather than value-based.
Maybe I wrote it wrong; I knew the old RFC had conversions based on
values. I just thought the rules based on types (you definitely know
what you need to convert and what not before calling a function) would
make much more sense, but the problem with 2^53 means that the
type-based conversions are not a great solution either.
Pavel Kouril