Newsgroups: php.internals
From: michal@brzuchalski.com (Michał Brzuchalski)
To: Michael Morris
Cc: PHP internals
Date: Thu, 4 Jan 2018 14:45:53 +0100
Subject: Re: [PHP-DEV][RFC][DISCUSSION] Strong Typing Syntax

2018-01-04 3:37 GMT+01:00 Michael Morris:

> Second Draft based on the feedback upstream.
>
> Target version: PHP 8.
>
> This is a proposal to strengthen the dynamic type checking of PHP during
> development.
>
> Note - this is not a proposal to change PHP to a statically typed language
> or to remove PHP's current loose typing rules. PHP is a weakly typed
> language for a reason, and will remain so subsequent to this RFC. This RFC
> is concerned with providing tools to make controlling variable types
> stronger when the programmer deems this necessary.
>
>
> VARIABLE DECLARATION
>
> PHP currently has no keyword to initialize a variable - it is simply
> created when it is first referenced. The engine infers the appropriate type
> for the variable, and this may be later cast to other types depending on
> the context of the code. Objects can have magic functions to carry out this
> casting such as __toString.
>
> It is sometimes useful to explicitly state a variable's type. One case is
> when the engine might incorrectly infer the type. For example "073117" is a
> valid octal integer but also a date string in mmddyy format, so a
> comparison with another date string in the same format could be... amusing.
> While there is a string comparison function, that function's presence is
> born of the fact that we can't reliably compare "073117" with, say,
> "010216" because of the int casting possibility.
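
As a side note on the comparison issue quoted above, here is a minimal sketch of how current PHP treats numeric strings in loose comparisons. The variable names are illustrative only. In comparisons and casts a numeric string such as "073117" is read as a decimal number; only the integer literal 073117 is parsed as octal.

```php
<?php
// Two strings that differ only by a leading zero; names are illustrative.
$a = "073117";
$b = "73117";

// Loose comparison: both operands are numeric strings, so PHP compares them
// as numbers and the leading zero is ignored.
$looseEqual = ($a == $b);

// Strict comparison and strcmp() compare the strings themselves.
$strictEqual = ($a === $b);
$sameBytes = (strcmp($a, $b) === 0);

// A numeric string is cast as decimal; only the literal is read as octal.
$fromString = (int) "073117";   // 73117
$fromLiteral = 073117;          // 30287 (octal 73117)

var_dump($looseEqual, $strictEqual, $sameBytes, $fromString, $fromLiteral);
```

So the loose `==` reports the two strings as equal, while `===` and strcmp() do not, which is the kind of surprise a locked string type is meant to rule out.
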
>
> Since the scalar types have already been reserved as keywords they can be
> used to declare variables in a manner not unlike C or Java.
>
> int $a = 073117;
>
> The var keyword is still around from PHP 4 days but is going unused. In
> JavaScript var is used to formally declare a variable though it isn't
> required (It remains important because without it JavaScript will search
> the scope chain of the current closure all the way to the top scope. If it
> doesn't find the reference it only then creates one. This can lead to huge
> headaches so the creation of variables without using var is strongly
> discouraged in JavaScript).
>
> Since the keyword is available, let's make use of it.
>
> var $a = "123";
>
> What I propose this will do is formally declare $a, infer its type, then
> LOCK the type from casting. If further assignments are made to the variable
> the quantity being assigned will be cast to the desired type if possible,
> otherwise a type error will be raised.
>
> var string $a = $_POST['date'];
>
> This syntax allows the programmer to choose the type rather than allowing
> the engine to infer it. Here $_POST['date'] might be provided as a date
> string that might be confused for an octal int.
>
> This magical casting is suggested because it follows the spirit of PHP, but
> it may not be strict enough. For those who want more, the type can be
> explicitly declared without using the var keyword as follows.
>
> int $a = 4;
>
> In this event a type error will occur on any attempt to assign a value to
> $a that isn't an int.
>
> The variable can still be re-declared in both cases.
>
> var $a = 4;
> string $a = "Hello";
>
> The var keyword can be combined with the new keyword to lock an object
> variable so it doesn't accidentally change.
>
> var $a = new SomeClass();
>
> As noted above a deliberate redeclare can still change the type of $a.
>

If $a is declared with an int type, shouldn't it be enough to simply freeze
its type to int?
The var keyword was used in PHP 4 and PHP 5, and I suppose no one uses it in
PHP 7 anymore, so why not deprecate it? IMO it should be burned and buried.
If every variable declaration with a type locked that type, then the var
keyword would be useless, am I right?

>
>
> ARRAYS
> All members of an array can be cast to one type using this syntax
>
> var string array $a = [ 'Mary', 'had', 'a', 'little', 'lamb' ];
> int array $b = [1,2,3,5];
>

Personally, I really don't like the proposed syntax. There is some work in
progress on generics, and IMO that should be the right way to declare
generic types.

> Or members can be individually cast
>
> var $a = [ var 'Todd', var 'Alex' ];
> $b = [string 'id' => int 1, 'name' => string 'Chad'];
>
> Again, following rules similar to the above. The main reason for doing
> this is to ensure smooth interaction with the pack and splat operators.
>
> function foo (var string array $a = ...);
>

Here again, why not just lock its type if we expect $a to be int? I assume
that if someone declares it as array or string, he did it with some purpose.

> And speaking of functions, that's the next section.
>
>
> FUNCTION DECLARATION
>
> Variables are also declared as arguments to functions. I propose using the
> var keyword to lock the resulting variable and perform a cast if possible.
>
> function foo( var string $a, var $b ) {}
>
> Note that, for consistency, using var without explicitly naming a type
> will be legal, if rarely used. Also, someone might have a use for an
> argument whose type could be anything, but won't change after it is
> received.
>
> The type can also be inferred from the default.
>
> function foo( var $a = "hello" ) {}
>
> This syntax is essentially doing a redeclare of the variable. This could be
> very troublesome with references, so a TypeError will result if this is
> tried.
>
> function foo ( var &$a = "Hello" ) {}
>
> $b = 3;
> foo($b);
>
> With objects the var keyword can be used to prevent the function from
> changing the object.
>
> function foo ( var SomeClass $a ) {}
>
>
> CLASS MEMBER DECLARATION
>
> Variables also appear as object members. Following the pattern established
> above their types can be locked. A couple of notes, though.
>
> class SomeClass {
>
>     var $a = '3';
>     public var $b = 'hello';
>
> }
>

This is awkward; it seems like returning to PHP 4 again. The public keyword
already states explicitly that the member is public, and I suppose everyone
has gotten used to it.

>
> For backwards compatibility the var keyword by itself must be equivalent to
> "public". It is only when a visibility keyword is present that var takes on
> its new meaning in this context.
>
> Magic __set and __get cannot access variables with locked types because,
> well, it will be a bloody mess. Basic getter/setter behavior (ensuring the
> datatype is correct) is accomplished just with the ability to type lock.
> Beyond that explicit getters and setters will be needed, and once again an
> inbuilt interface will be invoked. The interface is a little magical,
> though, like the ArrayAccess interface.
>
> class SomeClass implements AccessorInterface {
>
>     protected $a = '';
>     protected var $b = "string";
>     protected int $c = 5;
>
>     public function get_a() { return $this->a; }
>     public function set_b( $val ) { $this->b = (string) $val; }
>     public function get_c(): int { return $this->c; }
>
> }
>

IMO there are better proposals for getters/setters among the PHP RFCs. You
rely here on specific function naming, which may collide with or disrupt
popular naming conventions.

>
> Unlike userland interfaces, the AccessorInterface gets its potential method
> names from the properties of the members as well as their signatures.
> These methods follow the get_[varname] or set_[varname] pattern. Getters
> must return the same type as the underlying var if specified.
> Setters don't have to take a matching argument, as often their job is
> conversion.
>
>
> CASTING INTERFACES
>
> As these elements are introduced, the ability of objects to control how
> they are cast into scalars needs improving. I propose interfaces in the
> vein of ArrayAccess with the pattern below.
>
> interface IntegerCastable {
>     public function CastToInteger(): int;
> }
>
> PHP will call the function for the appropriate casting operation. Also, if
> an object with at least one of these interfaces is echoed then that
> cast will be performed, in the priority order string, float, int, bool.
>
> Note - The magic __toString and the StringCastable interface are mutually
> exclusive - trying to create an object with both will trip a ParseError.
>
>
> COMPARISON BEHAVIOR
> Controlling variable types gives us more granular control over comparisons
> by determining which of the variables in a comparison can be coerced. When
> a variable with a locked type is compared to another variable only that
> other variable can be coerced.
>
> Even better, comparisons between strongly typed variables are always strict
> and a TypeError results if their types don't match. This actually provides
> a way to force the greater than, less than, etc. to be strict.
>
>
> PERFORMANCE IMPLICATIONS - TURNING IT OFF.
> I imagine implementing all of the above will incur a performance hit. Yet,
> pretty much all of this could be done with strategic use of assert(). PHP
> is happy to not do this checking - much of it is for program testing and
> peace of mind. So the last piece of the proposal is to ensure all of the
> above can be disabled.
>
> I'm personally in favor of turning it off using the existing
> zend.assertions flag since all these checks are part of Design by Contract
> anyway. In addition to turning off all of the above I recommend allowing
> other function type declarations to be turned off by zend.assertions.
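
To illustrate the remark above that most of these checks could be written with assert(), here is a minimal sketch. The function name setQuantity is hypothetical; assert() and the zend.assertions ini setting are existing PHP features. With zend.assertions set to 1 a failed assertion is reported (as a warning or an AssertionError, depending on the PHP version and the assert.exception setting), while with -1 the assert() call is not compiled at all.

```php
<?php
// Hypothetical function: the type check lives in an assertion, so it is only
// evaluated when zend.assertions is 1 (development mode).
function setQuantity($value)
{
    // Design-by-Contract style precondition; skipped when zend.assertions
    // is 0 and not even compiled when it is -1.
    assert(is_int($value), 'quantity must be an int');

    return $value;
}

// Valid call: the assertion holds, so the value passes through unchanged.
echo setQuantity(4), PHP_EOL;
```

The same switch that silences the assertion here is the one the proposal would reuse for the engine-level type locks.
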
>
> There are reasons not to do that and use a separate flag for one or both of
> these methods. While I can live with that I would like to point out that
> debug flag proliferation can lead to confusion.
>
> The crux of my argument for using the zend.assertions flag is that these
> type checks are all, at the end of day, engine level assertions. PHP ships
> with zend.assertions set to 1, and with PHP 8 we can keep that default and
> recommend to providers to not assume it's safe to set it to -1 since there
> is a small, but not insignificant, chance that old code relying on type
> declarations to be on might corrupt user data. I admit this would be
> painful in the short term, but it is better for the long term health of the
> language and parser.
>
>
> CONCLUSION
> I believe that covers all the bases needed. This will give those who want
> to use strong typing better tools, and those who don't can be free
> to ignore them.
>

Please register a wiki account and add the proposed RFC to the RFC list.

-- 
regards / pozdrawiam,
--
Michał Brzuchalski
about.me/brzuchal
brzuchalski.com