Newsgroups: php.internals
From: tendoaki@gmail.com (Michael Morris)
Date: Wed, 3 Jan 2018 21:37:22 -0500
To: PHP internals
Subject: Re: [PHP-DEV][RFC][DISCUSSION] Strong Typing Syntax

Second Draft based on the feedback upstream. Target version: PHP 8.

This is a proposal to strengthen the dynamic type checking of PHP during development. Note - this is not a proposal to change PHP to a statically typed language or to remove PHP's current loose typing rules. PHP is a weakly typed language for a reason, and will remain so subsequent to this RFC. This RFC is concerned with providing tools to make controlling variable types stronger when the programmer deems this necessary.

VARIABLE DECLARATION

PHP currently has no keyword to initialize a variable - it is simply created when it is first referenced. The engine infers the appropriate type for the variable, and this may be later cast to other types depending on the context of the code. Objects can have magic functions to carry out this casting such as __toString.

It is sometimes useful to explicitly state a variable's type.
One case is when the engine might incorrectly infer the type. For example "073117" is a valid octal integer but also a date string in mmddyy format, so a comparison with another date string in the same format could be... amusing. While there is a string comparison function, that function's presence is born of the fact that we can't reliably compare "073117" with, say, "010216" because of the int casting possibility.

Since the scalar types have already been reserved as keywords they can be used to declare variables in a manner not unlike C or Java.

    int $a = 073117;

The var keyword is still around from PHP 4 days but is going unused. In JavaScript var is used to formally declare a variable though it isn't required. (It remains important because without it JavaScript will search the scope chain of the current closure all the way to the top scope, and only if it doesn't find the reference there does it create one. This can lead to huge headaches, so the creation of variables without using var is strongly discouraged in JavaScript.)

Since the keyword is available, let's make use of it.

    var $a = "123";

What I propose this will do is formally declare $a, infer its type, then LOCK the type from casting. If further assignments are made to the variable, the value being assigned will be cast to the locked type if possible; otherwise a type error will be raised.

    var string $a = $_POST['date'];

This syntax allows the programmer to choose the type rather than allowing the engine to infer it. Here $_POST['date'] might be provided as a date string that might be confused for an octal int.

This magical casting is suggested because it follows the spirit of PHP, but it may not be strict enough. For those cases the type can be explicitly declared without using the var keyword as follows.

    int $a = 4;

In this event a type error will occur on any attempt to assign a value to $a that isn't an int.

In both cases the variable can still be re-declared, like so:
    var $a = 4;
    string $a = "Hello";

The var keyword can be combined with the new keyword to lock an object variable so it doesn't accidentally change.

    var $a = new SomeClass();

As noted above, a deliberate redeclare can still change the type of $a.

ARRAYS

All members of an array can be cast to one type using this syntax.

    var string array $a = [ 'Mary', 'had', 'a', 'little', 'lamb' ];
    int array $b = [1,2,3,5];

Or members can be individually cast.

    var $a = [ var 'Todd', var 'Alex' ];
    $b = [string 'id' => int 1, 'name' => string 'Chad'];

Again, following rules similar to the above. The main reason for doing this is to ensure smooth interaction with the pack and splat operators.

    function foo (var string array $a = ...);

And speaking of functions, that's the next section.

FUNCTION DECLARATION

Variables are also declared as arguments to functions. I propose using the var keyword to lock the resulting variable and perform a cast if possible.

    function foo( var string $a, var $b ) {}

Note that using var without an explicit type will be legal, if rarely used, for consistency reasons. Also, someone might have a use for an argument whose type could be anything, but won't change after it is received. The type can also be inferred from the default.

    function foo( var $a = "hello" ) {}

This syntax is essentially doing a redeclare of the variable. This could be very troublesome with references, so a type error will result if this is tried.

    function foo ( var &$a = "Hello" ) {}
    $b = 3;
    foo($b);

With objects the var keyword can be used to prevent the function from changing the object.

    function foo ( var SomeClass $a ) {}

CLASS MEMBER DECLARATION

Variables also appear as object members. Following the pattern established above, their types can be locked. A couple of notes, though.

    class SomeClass {
        var $a = '3';
        public var $b = 'hello';
    }

For backwards compatibility the var keyword by itself must be equivalent to "public".
It is only when a visibility keyword (public, protected or private) is present that var takes on its new meaning in this context.

Magic __set and __get cannot access variables with locked types because, well, it will be a bloody mess. Basic getter/setter behavior (ensuring the datatype is correct) is accomplished just with the ability to type lock. Beyond that, explicit getters and setters will be needed, and once again an inbuilt interface will be invoked. The interface is a little magical though, like the ArrayAccess interface.

    class SomeClass implements AccessorInterface {
        protected $a = '';
        protected var $b = "string";
        protected int $c = 5;

        public function get_a () { return $this->a; }
        public function set_b( $val ) { $this->b = (string) $val; }
        public function get_c():int { return $this->c; }
    }

Unlike userland interfaces, the AccessorInterface gets its potential method names from the properties of the members as well as their signatures. These methods follow the get_[varname] or set_[varname] pattern. Getters must return the same type as the underlying var if specified. Setters don't have to take a matching argument, as often their job is conversion.

CASTING INTERFACES

As these elements are introduced, the ability of objects to control how they are cast into scalars needs improving. I propose interfaces in the vein of ArrayAccess with the pattern below.

    interface IntegerCastable {
        public function CastToInteger():int;
    }

PHP will call the function for the appropriate casting operation. Also, if an object with at least one of these interfaces is echo'ed out then that cast will be performed, in the priority order string, float, int, bool.

Note - The magic __toString and the StringCastable interface are mutually exclusive - trying to create an object with both will trip a ParseError.

COMPARISON BEHAVIOR

Controlling variable types gives us more granular control over comparisons by determining which of the variables in a comparison can be coerced.
When a variable with a locked type is compared to another variable, only that other variable can be coerced. Even better, comparisons between strongly typed variables are always strict, and a TypeError results if their types don't match. This actually provides a way to force the greater than, less than, etc. to be strict.

PERFORMANCE IMPLICATIONS - TURNING IT OFF.

I imagine implementing all of the above will incur a performance hit. Yet, pretty much all of this could be done with strategic use of assert(). PHP is happy to not do this checking - much of it is for program testing and peace of mind. So the last piece of the proposal is to ensure all of the above can be disabled.

I'm personally in favor of turning it off using the existing zend.assertions flag, since all these checks are part of Design by Contract anyway. In addition to turning off all of the above, I recommend allowing other function type declarations to be turned off by zend.assertions.

There are reasons not to do that and to use a separate flag for one or both of these methods. While I can live with that, I would like to point out that debug flag proliferation can lead to confusion. The crux of my argument for using the zend.assertions flag is that these type checks are all, at the end of the day, engine level assertions.

PHP ships with zend.assertions set to 1, and with PHP 8 we can keep that default and recommend to providers to not assume it's safe to set it to -1, since there is a small, but not insignificant, chance that old code relying on type declarations being on might corrupt user data. I admit this would be painful in the short term, but it is better for the long term health of the language and parser.

CONCLUSION

I believe that covers all the bases needed. This will give those who want to use strong typing better tools, and those who don't can be free to ignore them.
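To make the assert() point from the performance section a little more concrete, here is a rough userland sketch of how the strict form (int $a = 4;) could be approximated in PHP as it exists today. The Locked class and its get()/set() methods are names invented purely for this illustration; they are not part of PHP or of the proposal, and the sketch makes no attempt at the casting behaviour of the var form.

```php
<?php
// Hypothetical userland stand-in for a type-locked variable.
// "Locked" is an illustrative name, not part of PHP or of this RFC.
final class Locked
{
    private $type;   // type name recorded at declaration, e.g. "integer"
    private $value;

    public function __construct($value)
    {
        $this->type  = gettype($value);
        $this->value = $value;
    }

    public function set($value)
    {
        // The type check lives inside assert(), so it is only evaluated
        // while zend.assertions is 1; with -1 the call is not even compiled.
        assert(
            gettype($value) === $this->type,
            "expected {$this->type}, got " . gettype($value)
        );
        $this->value = $value;
    }

    public function get()
    {
        return $this->value;
    }
}

$a = new Locked(4);       // roughly: int $a = 4;
$a->set(5);               // fine, still an integer
// $a->set("Hello");      // would fail the assertion while zend.assertions=1
```

The check disappears along with every other assertion when zend.assertions is switched off, which is the behaviour argued for above; an engine-level version would presumably be cheaper than this wrapper object.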