Newsgroups: php.internals
Path: news.php.net
Xref: news.php.net php.internals:83479
Date: Sun, 22 Feb 2015 20:37:06 +0200
Message-ID: <2e4694f9805ee81ea0b2c79eab06c2d6@mail.gmail.com>
To: Jefferson Gonzalez <jgmdev@gmail.com>
Cc: PHP internals <internals@lists.php.net>
Content-Type: text/plain; charset=UTF-8
Subject: JIT (was RE: [PHP-DEV] Coercive Scalar Type Hints RFC)
From: zeev@zend.com (Zeev Suraski)

> -----Original Message-----
> From: Jefferson Gonzalez [mailto:jgmdev@gmail.com]
> Sent: Sunday, February 22, 2015 4:25 PM
> To: Etienne Kneuss; Anthony Ferrara; Zeev Suraski
> Cc: PHP internals
> Subject: Re: [PHP-DEV] Coercive Scalar Type Hints RFC
>

Jefferson,

Please note that Anthony, the lead of the Dual Mode RFC, said this earlier
in this thread, referring to the claim that Strict STH can improve JIT/AOT
compilation:

"A statement which I said on the side, and I said should not impact RFC or
voting in any way. And is in no part in my RFC at all."

Please also see:

marc.info/?l=php-internals&m=142439750614527&w=2

So while Anthony and I don't agree on whether there are performance gains
to be had from Strict STH, both of us agree that it's not at a level that
should influence our decision regarding the RFCs on the table.

I wholeheartedly agree with that stance, which is why my RFC also lists
the apparently very widespread misconception (IMHO) that Strict STH can
meaningfully help JIT/AOT.

Despite that, as your email suggests, there are still (presumably a lot
of) people out there who assume that there are, in fact, substantial gains
to be had from JIT/AOT if we introduce Strict STH.  I'm going to take
another stab at explaining why that's not the case.

> A JIT or AOT machine code generator IMHO will never have a decent use of
> system resources without some sort of strong/strict typed rules, somebody
> explain if thats not the case.

It kind of is and kind of isn't.

There's consensus, I think, that if PHP were completely strongly typed -
i.e., all variables need to be declared and typed ahead of time, cannot
change types, etc. - we'd clearly be able to create a lot of optimizations
in AOT that we can't do today.  That's the part that 'is the case'.  But
nobody is suggesting that we do that.  The discussion on the table is
very, very narrow:

-- Can the code generated for a strict type hint somehow be optimized
significantly better than the code generated for a dynamic/coercive type
hint?

And here, I (as well as Dmitry, who actually wrote a JIT compiler for PHP)
claim that it isn't the case.  To be fair, there's no consensus on this
point.

Let me attempt, again, to explain why we don't believe there are any gains
associated with Strict STH, be it with the regular engine, JIT or AOT.

Consider the following code snippet:

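function strict_foo($x)
{
    if (!is_int($x)) {
        // Fail hard on anything that isn't already an integer
        trigger_error('strict_foo() expects an integer', E_USER_ERROR);
    }
    // .inner_code.
}

function very_lax_foo($x)
{
    // Convert whatever was passed to an integer
    $x = (int) $x;
    // .inner_code.
}

function test_strict()
{
    // .outer_code.
    strict_foo($x);
}

function test_lax()
{
    // .outer_code.
    very_lax_foo($x);
}

test_strict();
test_lax();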


strict_foo() implements a pretty much identical check to the one that a
Strict integer STH would perform.
very_lax_foo() implements an explicit type conversion to int, which can
pretty much never fail - that's significantly more lax than what is
proposed for weak types in the Dual Mode RFC, and even more so compared to
the Coercive STH RFC.
.inner_code. is identical between the two foo() functions, and
.outer_code. is identical between the two tester functions.

The claim that strict types can be more efficiently optimized than more
lax types suggests it should be possible to optimize the code flow for
test_strict()/strict_foo() significantly better than for
test_lax()/very_lax_foo() using JIT/AOT.

Let's dive in.  Beginning with the easy part, that's been mentioned
countless times - it's clear that it's just as easy (or hard) to optimize
the .inner_code. block in the two implementations of foo().  Both can bank
on the exact same assumption - $x is an integer.  So we can
optimize the two function bodies to exactly the same level.  For example,
if we're sure that $x inside the function never changes type - it can be
optimized down to a C-level long.  That's oversimplifying things a bit,
but the important thing here is that it can be easily proven that the two
function bodies can be optimized to the exact same level, for better or
worse.  The only difference between them is how they handle non-integer
inputs: the strict implementation errors out if it gets a non-integer
typed value, while the lax version happily accepts everything.  But that's
a functionality difference, not a performance one (i.e., if you want the
value to be accepted in the strict case, you'd manually conduct the
conversion before the call is made, or sooner - resulting in roughly the
same behavior and performance).
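
To make the point about the function bodies concrete, here is a rough C++
sketch of the two flavors.  It's only an illustration, not engine code:
Value, int_value(), string_value() and the arithmetic inside inner_code()
are made up for the example.  Both entry points end up running the same
inner_code() on a plain long; they differ only in how they treat a
non-integer argument.

```cpp
#include <cstdlib>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a PHP value: either an integer or a string.
struct Value {
    bool is_int;
    long i;
    std::string s;
};

Value int_value(long v) { return Value{true, v, ""}; }
Value string_value(const std::string& v) { return Value{false, 0, v}; }

// Stand-in for .inner_code.: compiled once, working on a raw C-level long.
long inner_code(long x) {
    return x * 2 + 1;
}

// Strict prologue: reject anything that is not already an integer.
long strict_foo(const Value& v) {
    if (!v.is_int) {
        throw std::invalid_argument("strict_foo() expects an integer");
    }
    return inner_code(v.i);
}

// Very lax prologue: convert whatever arrives, in the spirit of (int) $x.
long very_lax_foo(const Value& v) {
    if (v.is_int) {
        return inner_code(v.i);
    }
    // Leading-digits conversion, only roughly like PHP's string-to-int
    // cast; non-numeric input becomes 0.
    return inner_code(std::strtol(v.s.c_str(), nullptr, 10));
}
```

With an integer argument, strict_foo(int_value(5)) and
very_lax_foo(int_value(5)) both land in inner_code(5).  With
string_value("5") the strict flavor throws while the lax one converts,
which is a difference in behavior rather than in how well the body can be
optimized.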

Now the slightly trickier part - the .outer_code. block.  What can we say
about the type of $x, without knowing what code is in there?  Not a whole
lot.  We know that if $x isn't going to be of type int at the end of this
block, test_strict() is going to fail, but that doesn't mean $x will truly
be an int.  The fact I want to be young and healthy doesn't mean I'm going
to magically become young and healthy :)

Let's dive further.  Assuming we don't have strong variable type
declarations (i.e., int $x;  $x = "foo"; // fails!), there are two
possible outcomes from analyzing the .outer_code. block:

1. We can infer (with varying levels of confidence) that $x is going to be
an int right before the call to foo() is made.  Whether we can infer that
or not has nothing to do with any implementation detail of foo(),
including whether it's using Strict type hints, Weak type hints, or
conducts nuclear simulations.  It has only to do with the code in
.outer_code., which means we can do it equally well (or not so well) in
both test_strict() and test_lax().  Before anybody asks, the levels of
confidence would not vary between the two flavors either; they, too, have
only to do with what's written in the .outer_code. block.

2. We cannot determine what type $x is going to be right before the call
to foo() is made.  Here too, our ability or inability to determine that is
identical between test_strict() and test_lax(), and has only to do with
our ability to analyze .outer_code., nothing else.
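
As a toy illustration of these two outcomes, here's a small C++ sketch of
such an analysis.  The Stmt and Type enums and inferTypeOfX() are
hypothetical, and real type inference is far more involved.  The only
point is that the analysis takes .outer_code. as its sole input; nothing
about the callee, strict or otherwise, is consulted.

```cpp
#include <vector>

// Hypothetical, drastically simplified model of the statements in
// .outer_code. that touch $x.
enum class Stmt {
    AssignIntLiteral,     // $x = 42;
    AssignStringLiteral,  // $x = "foo";
    AssignFromInput,      // $x = <something the analyzer can't see through>;
    CastToInt             // $x = (int) $x;
};

enum class Type { Unknown, Int, String };

// Walk straight-line code and track what we know about $x.  Note the
// signature: there is no parameter describing how the callee is declared.
Type inferTypeOfX(const std::vector<Stmt>& outerCode) {
    Type t = Type::Unknown;
    for (Stmt s : outerCode) {
        switch (s) {
            case Stmt::AssignIntLiteral:    t = Type::Int;     break;
            case Stmt::AssignStringLiteral: t = Type::String;  break;
            case Stmt::AssignFromInput:     t = Type::Unknown; break;
            case Stmt::CastToInt:           t = Type::Int;     break;
        }
    }
    return t;
}
```

A result of Type::Int corresponds to scenario 1, Type::Unknown to
scenario 2, and the answer is the same whether the call that follows is
strict_foo($x) or very_lax_foo($x).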

Now, let's continue to dive into the first scenario, as the second one can
obviously not be optimized in any meaningful way.

To simplify things, let's assume we know - with absolute confidence -
that $x is an int right before the call.  Can we somehow optimize
test_strict()/strict_foo() better than test_lax()/very_lax_foo()?  The
answer is - no, not really.  We could optimize them down to the exact same
machine-level code - bypassing the is_int() check in the strict_foo()
case, and the explicit cast in the very_lax_foo() case.  With absolute
confidence that $x is an int, we could have a single long pass the
caller-callee boundary in both cases.  This is easier said than done -
JIT/AOT are quite complex - but it's equally hard in both cases, and the
end result is identical.

> As I see it, some example, if the JIT generated C++ code to then
> generate the machine code:
>
> function calc(int $val1, int $val2) : int {return $val1 + $val2;}

How does that handle the situation where $val1 is a float?  Or a string?
Or an array?
Here you are already assuming that you KNOW $val1 and $val2 are ints, but
nothing about declaring $val1/$val2 as 'strict int' implies anything
regarding the value/type that will actually be passed to the function.

> On weak mode I see the generated code would be something like this:
>
> Variant* calc(Variant& val1, Variant& val2) {
>      if(val1.isInt() && val2.isInt())
>          return new Variant(val1.toInt() + val2.toInt());
>
>      else if(val1.isFloat() && val2.isFloat())
>          return new Variant(val1.toInt() + val2.toInt());
>      else
>          throw new RuntimeError();
> }

Technically it'd be more like this:

Variant* calc(Variant& val1, Variant& val2) {
    // type checking
    if (!val1.coerceToInt()) {
        throw new RuntimeError();
    }
    if (!val2.coerceToInt()) {
        throw new RuntimeError();
    }

    // function body begins here
    return new Variant(val1.intValue() + val2.intValue());
}

But the code that's generated for strict typing would actually not look
very different, if it can't assume that val1 & val2 are ints.  A generic
implementation that has no insight about the incoming values would look
almost identical:

Variant* calc(Variant& val1, Variant& val2) {
    // type checking
    if (!val1.isInt()) {
        throw new RuntimeError();
    }
    if (!val2.isInt()) {
        throw new RuntimeError();
    }

    // function body begins here
    return new Variant(val1.intValue() + val2.intValue());
}

Between both cases, the actual code body (after the 'function body begins
here' comment), can be optimized in exactly the same way, and probably all
the way down to this:

long calc(long val1, long val2) {
  return val1 + val2;
}

In the strict case, if we somehow know ahead of time or just in time that
val1.isInt() and val2.isInt() are true (for a particular instance of a
call to calc()) - we can create more efficient function calling code
that will do away with all of the type checking, and go directly to the
function body.

But is the Dynamic version any different?  Not at all.
In fact, if val1.isInt() and val2.isInt() are true, we can equally bypass
the coerceToInt() calls in the dynamic version - as we already know both
values are ints!  What are we left with?  The exact same code, down to the
last bit.
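
Here's a minimal C++ sketch of that argument.  The Variant struct and the
function names are assumptions made for the illustration (they loosely
mirror the snippets above), not real engine or JIT output, and the
coercion rule is a simplified stand-in.  The two generic entry points
differ only in their prologue; once a call site is known to pass ints,
both collapse into the same direct call to the shared body.

```cpp
#include <cmath>
#include <cstdint>
#include <stdexcept>

// Hypothetical minimal Variant holding either an integer or a float.
struct Variant {
    enum class Tag { Int, Float } tag;
    std::int64_t i;
    double d;

    static Variant fromInt(std::int64_t v) { return {Tag::Int, v, 0.0}; }
    static Variant fromFloat(double v) { return {Tag::Float, 0, v}; }

    bool isInt() const { return tag == Tag::Int; }

    // Simplified coercion: accept ints as-is, and floats only when they
    // are whole numbers in a safely convertible range.
    bool coerceToInt() {
        if (tag == Tag::Int) return true;
        if (std::fabs(d) > 1e15 || std::trunc(d) != d) return false;
        i = static_cast<std::int64_t>(d);
        tag = Tag::Int;
        return true;
    }
};

// Shared function body, identical for both flavors.
std::int64_t calc_body(std::int64_t a, std::int64_t b) { return a + b; }

// Generic entry for a strict hint: check, then run the body.
std::int64_t calc_strict(Variant a, Variant b) {
    if (!a.isInt() || !b.isInt()) {
        throw std::runtime_error("calc(): int expected");
    }
    return calc_body(a.i, b.i);
}

// Generic entry for a coercive hint: coerce, then run the body.
std::int64_t calc_coercive(Variant a, Variant b) {
    if (!a.coerceToInt() || !b.coerceToInt()) {
        throw std::runtime_error("calc(): cannot coerce to int");
    }
    return calc_body(a.i, b.i);
}

// Call site where both arguments have been proven to be ints: the isInt()
// checks and the coerceToInt() calls are both redundant, so either flavor
// reduces to this.
std::int64_t call_site_known_ints(std::int64_t a, std::int64_t b) {
    return calc_body(a, b);
}
```

For int arguments, calc_strict(), calc_coercive() and
call_site_known_ints() all compute the same thing through the same body.
The flavors diverge only on non-int input such as
Variant::fromFloat(2.0), where the strict entry throws and the coercive
one converts.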

To summarize, the difference between the different flavors of STH simply
has no meaningful impact on the performance of the current non-JITted PHP
as well as potential future JIT/AOT engines.  If you can infer the caller
argument types - you can reach the same level of performance in all the
different types of STH.  If you can't infer the types - the code would
look remarkably similar between the two, with the big difference being
behavioral - not performance.  All types of STH give you the exact same
input for optimizing the callee, down to the last bit.

> 1. Does weak mode could provide the required rules to implement a JIT
> with a sane level of memory and CPU usage?

It would use the exact same amount of memory and CPU as strict mode.  The
hard part remains inferring types - and as I hope I've illustrated, there's
no difference between the flavors of STH on that front.

> 2. I see that the proponents of dual weak/strict modes are offering to
> write a AOT implementation if strict makes it, And the impresive work of
> Joe (JITFU) and Anthony on recki-ct with the strict mode could be taken
> to another level of integration with PHP and performance. IMHO is harder
> and more resource hungry to implement a JIT/AOT using weak mode. With
> that said, if a JIT implementation is developed will the story of the
> ZendOptimizer being a commercial solution will be repeated or would this
> JIT implementation would be part of the core?

Actually, what I'm seeing is the proponents of dual weak/strict mode
saying that JIT/AOT should not be a part of the discussion around the
RFCs, and people refusing to accept that! :)
But more importantly, as early as when we announced PHPNG, we said we'd
want to look into JIT solutions once it ships.  We'd obviously want to
cooperate with everyone interested to try and create the best JIT
implementation possible once PHP 7 is out the door, as a part of an open
effort.  If it's ever any good, it'll make it into the core, if accepted.

> Thats all that comes to mind now, and while many people doesn't care for
> performance, IMHO a programming language mainly targeted for the web
> should have some caring on this department.

I think it's fair to say that Dmitry - who led the PHPNG effort - cares *a
lot* about performance.  I'm sure you'd agree.  I tend to think that I also
care
a lot about performance, and so does Xinchen.  We all spent substantial
parts of our lives working to speed PHP up.  It's not whether we think
performance is important - it is (although we do believe we should build
optimizers for languages, more so than languages for optimizers).  It's
just that we all fail to see how the flavor of STH can have any meaningful
influence on performance.

Thanks for the feedback!

Zeev