Newsgroups: php.internals
Xref: news.php.net php.internals:83469
Date: Sun, 22 Feb 2015 13:00:06 +0000
Subject: Re: [PHP-DEV] Coercive Scalar Type Hints RFC
From: colder@php.net (Etienne Kneuss)
To: Anthony Ferrara, Zeev Suraski
Cc: PHP internals

On Sat Feb 21 2015 at 21:08:39 Anthony Ferrara wrote:

> Zeev,
>
> I won't nit-pick every point, but there are a few I think need to be
> clarified.
>
> >> > Proponents of Dynamic STH bring up consistency with the rest of
> >> > the language, including some fundamental type-juggling aspects
> >> > that have been key tenets of PHP since its inception. Strict STH,
> >> > in their view, is inconsistent with these tenets.
> >>
> >> Dynamic STH is apparently consistent with the rest of the language's
> >> treatment of scalar types. It's inconsistent with the rest of the
> >> language's treatment of parameters.
> >
> > Not in the way Andrea proposed it, IIRC. She opted to go for
> > consistency with internal functions. Either way, at the risk of being
> > shot for talking about spiritual things, Dynamic STH is consistent
> > with the dynamic spirit of PHP, even if there are some discrepancies
> > between its rule-set and the implicit typing rules that govern
> > expressions. Note that in this RFC I'm actually suggesting a possible
> > way forward that will align *all* aspects of PHP, including implicit
> > casting - and have them all governed by a single set of rules.
>
> The point I was making up to there is that we currently have 2 type
> systems: user-land object and ZPP-scalar. So in any given function you
> have 2 type systems interacting. The current ZPP scalar type is
> dynamic, and user-land object static.
>
> With the proposal here, you'd unify user-land scalar to behave as
> zpp-scalar. So you'd have two type systems in any given function:
> scalar and object (which behave differently).
>
> My proposal gives you the same two by default (scalar and object) and
> a strict switch to collapse them into a single, unified type system.
>
> This is even more apparent with the int-float acceptance, because we
> can mentally model Float as an object that extends Int. Then it makes
> perfect sense why you'd accept ints where you see floats, but not the
> opposite.
>
> >> However there's an important point to make here: a lot of best
> >> practice has been pushing against the way PHP treats scalar types in
> >> certain cases. Specifically around == vs === and using strict
> >> comparison mode in in_array, etc.
> >
> > I think you're correct on comparisons, but not so much on the rest.
> > Dynamic use of scalars in expressions is still exceptionally common
> > in PHP code. Even with comparisons, == is still very common - and
> > you'd use == vs. === depending on what you need.
> >
> >> So while it appears consistent with the rest of PHP, it only does so
> >> if you ignore a large part of both the language and the way it's
> >> commonly used.
> >
> > Let's agree to disagree. That's one thing we can always agree on! :)
>
> I'm talking about the object system. I don't think you're disagreeing
> that it's static. Hence coercive scalars are consistent only if you
> look at 1/2 the type system. That was the point I was making there.
>
> >> 3. "Just Do It but give users an option to not" - This has the
> >> problems that E_DEPRECATED has, but it also gets us back to having
> >> fundamental code behavior controlled by an INI setting, which for a
> >> very long time this community has generally seen as a bad thing
> >> (especially for portability and code re-use).
> >
> > I do too, and I was upfront about their cons, not just pros. And yet,
> > they all bring us to a much better outcome within a relatively short
> > period of time (in the lifetime of a language) than the Dual Mode
> > will.
>
> Let's agree to disagree that an ini setting will be better than a
> per-file setting.
>
> In fact, I personally think this is major enough of an issue that I
> will vote no simply on this reason alone (type behavior depending on
> an ini setting in any way shape or form).
>
> >> > Further, the two sets can cause the same functions to behave
> >> > differently depending on where they're being called
> >>
> >> I think that's misleading. The functions will always behave the same.
> >> The difference is how you get data into the function. The behavior
> >> difference is in your code, not the end function.
> >
> > I'll be happy to get a suggestion from you on how to reword that.
> > Ultimately, from the layman user's point of view, she'd be calling
> > foo() from one place and have it accept her arguments, and foo() from
> > another place and have it reject the very same arguments.
>
> Let me think on it and I will come up with something.
>
> >> With strict mode, you'd have to embed a cast (smart or explicit) to
> >> convert to an integer at the point the data comes in.
> >
> > First, I'm not aware of smart/safe casts being available or proposed
> > at this point.
> > Secondly, why at the point the data comes in? That would be ideal for
> > static analyzers, but it's probably a lot more common that it will be
> > done at the first point in time where it gets rejected.
>
> By "smart cast" I was referring to a function which checked
> is_numeric(). Not a new language construct.
>
> > I have a hard time connecting to the 'power' approach. I think
> > developers want their code to work, with minimal effort, and be
> > secure. Coercive scalar type hints will do an excellent job at that.
> > Strict type hints will be more work, are bound to trigger a lot of
> > "Oh come on" responses, and as a special bonus - proliferate the use
> > of explicit casts. Let me top that - you'd have developers who think
> > they're security conscious, because they're using strict mode - with
> > code that's full of explicit casts.
>
> I agree we should have users avoid explicit casts. That's why the
> dual-mode proposal exists. If users don't want to control their types,
> they should use the default mode. And everything works fine.
>
> If they know what they want, then the explicit cast becomes a
> documenting piece of information that "this is supposed to happen".
> Ex:
>
> function takesInt(int $a) {}
>
> function foo(float $b) {
>     return takesInt($b);
> }
>
> In weak mode, that "just works". But is it supposed to just work? You
> have no idea. The next developer who comes will look at it and ask "is
> that supposed to truncate, or was that an oversight?" and have no
> idea. But in strict mode, placing an explicit cast before $b shows the
> next developer who comes there "the truncation was intentional".
>
> >> > Static Analysis. It is the position of several Strict STH
> >> > proponents that Strict STH can help static analysis in certain
> >> > cases. For the same reasons mentioned above about JIT, we don't
> >> > believe that is the case
> >>
> >> This is patently false.
> >
> > It's actually patently true. We don't believe that is the case. QED.
>
> To understand why "we don't believe" can be false, let's make an
> analogy: I can say that I don't believe in gravity. That doesn't mean
> that the opinion isn't patently false just because it was stated as an
> opinion (or rather the "believe" is true, but the implication of the
> belief is false)...
>
> >
> >> Keep not believing it all you want, but *static analysis*
> >> requires statically looking at code. Which means you have no value
> >> information. So static analysis can't possibly happen in cases where
> >> you need to know about value information (because it's not there).
> >> Yes, at function entry you know the types. But static analysis isn't
> >> about analyzing a single function (in fact, that's the least
> >> interesting case). It's more about analyzing a series of functions,
> >> a function call graph.
> >> And in that case strict typing (based only on type) does make a big
> >> difference.
> >
> > I think it's fair to say that while we were unable to convince you
> > there's no tangible extra value in Strict STH compared to any other
> > kind of STH that guarantees the type of value a function will get,
> > you were also unable to convince Dmitry, Stas or myself - all of
> > which independently discussed it with you. Again, despite that, I'm
> > not saying that you're "patently wrong", just that I don't believe
> > you're right.
>
> I've built a static analyzer that's public. I've talked to people who
> build them for a living. I don't claim to be an expert in them (far
> from it), but what I've seen and learned is that what you're talking
> about here either isn't possible (yet) or is difficult enough to be
> impractical (in terms of computing resources necessary).
>
> You can disagree with me all you want. You don't even need to convince
> me. All you need to do is disprove me. Show me a static analyzer for a
> sufficiently dynamic language (Scalar PHP or full JS - not ASM.js -
> would work) and I'll happily apologize and retract the comment. But so
> far all I've seen are people saying it's possible even in presence of
> arguments to the contrary (why it's not possible).
>
>

There have been several attempts, for JS:
http://users-cs.au.dk/simonhj/tajs2009.pdf
or similar techniques applied to PHP, quite outdated though:
https://github.com/colder/phantm

You are right that the lack of static information about types is one of
the main issues. Recovering the types typically has a huge performance
cost, or is unreliable.

But seriously, time is getting wasted on this argument; it's actually a
no-brainer: more static information helps tools that rely on static
information. Yes. Absolutely. 100%.

The question is rather: how much weight should we give (potential/future)
external tools when developing language features?
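
For concreteness, the takesInt() example quoted above can be sketched
roughly as follows. This is only an illustration, assuming the
declare(strict_types=1) directive as it later shipped in PHP 7; the names
fooUnchecked(), fooTruncating() and takesFloat() are made up for the
sketch and do not come from either RFC.

```php
<?php
declare(strict_types=1); // must be the first statement in the file

function takesInt(int $a): int {
    return $a;
}

// No cast: in strict mode the float argument is rejected at the call.
function fooUnchecked(float $b): int {
    return takesInt($b); // throws TypeError under strict_types=1
}

// Explicit cast: the truncation is visibly intentional.
function fooTruncating(float $b): int {
    return takesInt((int) $b);
}

// int -> float widening is still accepted in strict mode.
function takesFloat(float $x): float {
    return $x;
}

try {
    fooUnchecked(1.5);
    $rejected = false;
} catch (TypeError $e) {
    $rejected = true;
}

var_dump($rejected);          // bool(true)
var_dump(fooTruncating(1.5)); // int(1)
var_dump(takesFloat(3));      // float(3)
```

The point for tooling is that the cast makes the float-to-int step an
explicit part of the source, so an analyzer does not have to guess
whether the conversion was meant.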