Newsgroups: php.internals
From: ircmaxell@gmail.com (Anthony Ferrara)
Date: Mon, 23 Feb 2015 09:28:27 -0500
Subject: Re: [PHP-DEV] JIT (was RE: [PHP-DEV] Coercive Scalar Type Hints RFC)
To: Robert Stoll
Cc: PHP internals

Robert,

> [Robert Stoll]
> Sure, "a" was just an example to illustrate the problem. I figured it would not be necessary to say that the value of $b can be completely unknown by the static analyser -> could come from user input, from a database, from unserialising code etc. (but probably that is what you meant with "this isn't the general case" below).
>
> Assuming statically that $a is int or $b is string is erroneous in this context.
>
> Another problem to illustrate that a top type or at least some form of union type is required:
>
> function foo($x, $y, $z){
>     $a = 1;
>     if($x){
>         $a = "1";
>     }
>     If($y > 10){
>         $a = [];
>     }
>     If($z->foo() < 100){
>         $a = new Exception();
>     }
>     echo $a;
>     return $a;
> }
>
> How do you want to type $a without using a union type?

Actually, this case is reasonably easy to handle. There's a representation called SSA (Static-Single-Assignment) that you move code to prior to doing type analysis. Basically, at a really high level, it would rewrite the code to this:

    function foo($x, $y, $z){
        $a = 1;
        if($x){
            $a1 = "1";
        }
        $a2 = Φ($a, $a1);
        if($y > 10){
            $a3 = [];
        }
        $a4 = Φ($a2, $a3);
        if($z->foo() < 100){
            $a5 = new Exception();
        }
        $a6 = Φ($a4, $a5);
        echo $a6;
        return $a6;
    }

Where Φ is a function that chooses the value based on the branch of the graph that entered it.
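
To make the Φ idea a bit more concrete, here is a rough sketch of the type bookkeeping an analyzer could do over the SSA form above. It is only an illustration, not how any particular compiler does it: `phiTypes` is a made-up helper, and a "type" is modelled as a plain list of type names. Every directly assigned SSA name ($a, $a1, $a3, $a5) has exactly one type; only the Φ results can carry more than one.

```php
<?php
// Hypothetical helper (not from any real analyzer): the type set of a
// phi node is the union of the type sets of its operands.
function phiTypes(array ...$operands): array
{
    $merged = [];
    foreach ($operands as $types) {
        foreach ($types as $t) {
            $merged[$t] = true; // keys de-duplicate the type names
        }
    }
    $names = array_keys($merged);
    sort($names, SORT_STRING); // stable, predictable order for comparison
    return $names;
}

// Direct assignments in the SSA form: one type each.
$a  = ['int'];        // $a  = 1
$a1 = ['string'];     // $a1 = "1"
$a3 = ['array'];      // $a3 = []
$a5 = ['Exception'];  // $a5 = new Exception()

// Phi nodes: the only places where types merge.
$a2 = phiTypes($a, $a1);
$a4 = phiTypes($a2, $a3);
$a6 = phiTypes($a4, $a5);

echo implode('|', $a6), "\n";
```

The point of the sketch is that the "union" only shows up at the Φ nodes, so those are the only places where a compiler has to decide between a variant and separate code paths.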
There are a few ways to implement it in practice. One would be to generate a variant. Another would be to generate different code paths. Considering that $a5 will be the result if $z->foo() < 100 no matter what the prior conditionals are, you could invert the code to push that check first, making it:

    function foo($x, $y, $z) {
        if ($z->foo() < 100) {
            $a = new Exception();
            echo $a;
            return $a;
        }
        if ($y > 10) {
            echo [];
            return [];
        }
        if ($x) {
            echo "1";
            return "1";
        }
        echo 1;
        return 1;
    }

That transform can be done by the compiler, so you never need to do anything yourself. We still compiled without variants, and the analysis job wasn't that difficult.

There will be cases of course where this won't work. And in those cases we could either not compile, or generate a variant.

However, I would like to point out something. If you added a return type, and ran that code in strict mode, it would error. A static analyzer can pick up that error and tell you about it. So really, we're not talking about valid strict code here (though the same problem does exist inside strict function bodies, and the same techniques can be applied there).

For more info, check out: https://github.com/google/recki-ct/blob/master/doc/5_phi_resolving.md

>>
>> And hence know **at compile time** that's an error.
>>
>> This isn't the general case, but we can error in that case (from a static analysis perspective at least) and say "this code is too
>> dynamic". In strict mode at least.
>
> [Robert Stoll]
> If you go and implement a more conservative type system than the actual dynamic type system of PHP well then... you can do whatever you want of course.
> But if you do not just want to support just a limited set of PHP then you will need to include dynamic checks in many places. Or do you think that is not true?

I think with strict type declarations, the 'limitations' are far less than you'd think. Yes, there will be cases (like variable variables, etc) where valid strict code won't be analyzable.
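
As a small illustration of the variable-variable case just mentioned (a sketch only, assuming PHP 7 style scalar type declarations; `pick`, `$count` and `$label` are made-up names):

```php
<?php
// Hypothetical example: the variable read on the return line is chosen
// by a runtime string, so its type cannot be determined statically.
function pick(string $name): int
{
    $count = 1;      // int
    $label = "one";  // string
    return $$name;   // reads $count, $label, or an undefined variable
}

// This call happens to take the int path and satisfies the return type.
echo pick("count"), "\n";
```

Whether the declared return type holds depends entirely on the string passed in: pick("label") would hand a non-numeric string to an int return and fail at runtime with a TypeError. An analyzer looking only at the function body can't tell which of those it will be.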
But I haven't seen var-vars in the wild in a while.

So I think my assertion is fair: the majority of *valid* strict-typed code will be analyzable, whereas the majority of coercive code won't be.

Anthony