Has anyone tried this or know of anyone who is interested in
implementing this for the Zend Engine? I tried searching the archives
and didn't find anything on this topic. (Would Google help? No, it only
turns up a commercial PHP compiler for PHP 4.x.)
I believe it is possible currently using the Zend Engine and working it
either on top of APC or in place of APC. It would quite possibly help if
I ventured further into the Zend Engine and looked at APC source.
Researching the topic has brought forth a very complex subject matter,
which I suppose is one reason why it hasn't been implemented yet. It is
easier (yes?) to compile to opcode and then interpret that or compile
directly to machine code than building a JIT for all known CPU
architectures (there goes two long years! More if I try to implement it
and I do plan on trying... and will fail at it, but it should be
interesting and fun to say the least).
The reason I ask is that, after looking at speed comparisons, PHP does
appear to fall behind even Ruby and Python. It is becoming difficult to
justify continuing to code in PHP based on what appear to be
objective speed results. Perhaps they did not use APC or an
optimizer in the speed comparisons.
Discussions with my teacher on the subject matter further proved my
assertion that PHP would be better served with JIT compiler than APC
(Sorry! Sad but true). I will try to justify my statement and let more
intelligent people of this mailing list beat me down, if the case is
that I'm wrong.
Native machine code will always be quicker than interpreting opcodes (I
assume that the PHP engine interprets and acts on the opcodes that are
passed to it when using APC). The reason, from my research, is that an
opcode passes through a layer before hitting the CPU, whereas machine
code can pass directly to the CPU. Machine code also does not need to be
interpreted, which saves that overhead.
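To make the "layer" concrete, here is a minimal sketch in C. The three-instruction bytecode and the names interpret and native_add are invented for illustration and are not the Zend Engine's opcodes or API. The interpreter pays for a fetch, a switch, and stack traffic on every instruction; the native version is what a compiler could emit for the same program.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical three-instruction bytecode; not the Zend Engine's opcodes. */
enum op { OP_PUSH, OP_ADD, OP_HALT };

struct insn {
    enum op op;
    long operand; /* used by OP_PUSH only */
};

/* Interpreter: every instruction costs a fetch, a switch, and stack traffic. */
long interpret(const struct insn *code)
{
    long stack[16];
    size_t sp = 0;

    for (size_t pc = 0;; pc++) {
        switch (code[pc].op) {
        case OP_PUSH:
            assert(sp < 16);
            stack[sp++] = code[pc].operand;
            break;
        case OP_ADD:
            assert(sp >= 2);
            sp--;
            stack[sp - 1] += stack[sp];
            break;
        case OP_HALT:
            assert(sp >= 1);
            return stack[sp - 1];
        }
    }
}

/* What a compiler could emit for the same program: no dispatch at all. */
long native_add(long a, long b)
{
    return a + b;
}
```

Running interpret on the program PUSH 2, PUSH 3, ADD, HALT and calling native_add(2, 3) both give 5. The difference is only in how much work surrounds the one addition.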
It is possible to keep the PHP engine in control, while still running
the JIT compilation. Little fuzzy on exactly how this would work. Would
the compiled PHP script call PHP Engine, or would the PHP Engine call
the compiled PHP script, or keeping it all in memory and somehow
combining the two? Assembler seems quite fascinating, as well as
learning other tidbits about compiling and languages I did not know before.
Two possible open source projects that would speed the process up
considerably are GNU Lightning (http://www.gnu.org/software/lightning/)
and YASM, the library not the compiler
(http://www.tortall.net/projects/yasm/wiki). GNU Lightning seems to be
the "best" choice from reading the brief description as it would work
for most architectures (Apple included), which is in PHP's best
interest as PHP is available for many platforms. From my reading, the YASM
library only works for x86 and x86-64 architectures. Lightning is also
made for JIT, and therefore fits better for quick testing and deployment.
I'm not asking for anyone to take the project up, just what you think of
me doing something like this and your opinion on the merits of JIT
compiling.
This comes up once or twice a year. The machine code you compile to is
going to end up looking a lot like the current executor since you don't
have strong types to help you optimize anything. You'd still need to
pass the unions around and do runtime type juggling and all the overhead
that comes along with that.
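A rough sketch of that point in C, assuming a simplified tagged union. The struct value type and the value_add function are hypothetical stand-ins, not the engine's real zval layout. Even after compilation, an untyped "$a + $b" still has to inspect the type tags at run time:

```c
#include <assert.h>

/* Hypothetical stand-in for the engine's value type; not the real zval. */
enum val_type { VAL_LONG, VAL_DOUBLE };

struct value {
    enum val_type type;
    union {
        long lval;
        double dval;
    } u;
};

struct value make_long(long n)
{
    struct value v;
    v.type = VAL_LONG;
    v.u.lval = n;
    return v;
}

struct value make_double(double d)
{
    struct value v;
    v.type = VAL_DOUBLE;
    v.u.dval = d;
    return v;
}

static double to_double(struct value v)
{
    assert(v.type == VAL_LONG || v.type == VAL_DOUBLE);
    return v.type == VAL_LONG ? (double)v.u.lval : v.u.dval;
}

/* "$a + $b" without static types: the type check survives compilation. */
struct value value_add(struct value a, struct value b)
{
    if (a.type == VAL_LONG && b.type == VAL_LONG)
        return make_long(a.u.lval + b.u.lval); /* overflow handling omitted */
    return make_double(to_double(a) + to_double(b));
}
```

Machine code generated for value_add would contain the same branch on the type tags that the executor performs today, which is why the two end up looking alike.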
The idea behind PHP from day one was that it was an environment for
wrapping compiled code. Things that are performance critical are written
in C/C++ and things that aren't are left in the PHP templates. Whether
you issue an SQL query from PHP or from a compiled C program doesn't
affect the overall performance of the system so you might as well do
that from PHP.
If you are calculating a fractal, you write it in C and expose it to PHP
with a get_fractal($args) function call so you can mark it up and easily
change the args passed to the underlying function.
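As an illustration only, the C side of such a function might look like the sketch below. The name fractal_iterations is made up, and the glue that would expose it to PHP as get_fractal() is left out because it depends on the Zend extension API.

```c
#include <assert.h>

/*
 * Escape-time test for one point c = (cr, ci) of the Mandelbrot set.
 * Returns the index of the iteration at which |z| first exceeded 2,
 * or max_iter if the point never escaped.
 */
int fractal_iterations(double cr, double ci, int max_iter)
{
    double zr = 0.0, zi = 0.0;
    int i;

    assert(max_iter >= 0);
    for (i = 0; i < max_iter; i++) {
        double next_zr = zr * zr - zi * zi + cr;
        zi = 2.0 * zr * zi + ci;
        zr = next_zr;
        if (zr * zr + zi * zi > 4.0)
            break;
    }
    return i;
}
```

The tight floating-point loop is the part that benefits from being compiled; the PHP side would only choose the arguments and mark up the result.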
It is really important for PHP to have as little overhead as possible
between itself and the speed-critical code behind it and less important
that the userspace executor is fast. That doesn't mean it should be
slow. It should be as fast as we can make it, but not at the cost of
convenience.
-Rasmus
Rasmus Lerdorf wrote:
This comes up once or twice a year. The machine code you compile to is
going to end up looking a lot like the current executor since you don't
have strong types to help you optimize anything. You'd still need to
pass the unions around and do runtime type juggling and all the overhead
that comes along with that.
Strict typing is not required for JIT-level optimization. See:
Smalltalk, Self, Strongtalk, Tamarin. The Strongtalk guys, who have a
stricter-than-usual typed Smalltalk implementation, and a pretty good
JIT, actually ignore the type information during JIT analysis.
The idea behind PHP from day one was that it was an environment for
wrapping compiled code. Things that are performance critical is written
in C/C++ and things that aren't are left in the PHP templates. Whether
you issue an SQL query from PHP or from a compiled C program doesn't
affect the overall performance of the system so you might as well do
that from PHP.
And since day one, people have been building big ole PHP libraries ... :-)
For someone seriously interested in looking into this, there are some
'free' JITs available from the following projects: Apache Harmony (an
Apache licensed J2SE implementation), StrongTalk (Smalltalk), Adobe
Tamarin (recently donated to Mozilla), and Sun's HotSpot (Java;
curiously, HotSpot was based on StrongTalk!).
--
Patrick_Mueller@us.ibm.com
IBM PHP Community Architect, IBM Research Triangle Park
I believe it is possible currently using the Zend Engine and working it
either on top of APC or in place of APC. It would quite possibly help if
I ventured further into the Zend Engine and looked at APC source.
Researching the topic has bought forth a very complex subject matter,
which I suppose is one reason why it hasn't been implemented yet. It is
easier (yes?) to compile to opcode and then interpret that or compile
directly to machine code than building a JIT for all known CPU
architectures (there goes two long years! More if I try to implement it
and I do plan on trying... and will fail at it, but it should be
interesting and fun to say the least).
The reason why I ask is after looking at speed comparsions, PHP does
appear to fall behind even Ruby and Python. It is becoming difficult to
justify continuing coding using PHP based on what would appear to be
objective speed results. They perhaps, might not of used the APC or
optimizer in the speed comparisons.
Discussions with my teacher on the subject matter further proved my
assertion that PHP would be better served with JIT compiler than APC
(Sorry! Sad but true). I will try to justify my statement and let more
You don't understand how APC is used. On large web sites that run APC
each PHP file will, eventually, end up being compiled once and end up in
the bytecode cache. On these sites the PHP code that is run is known and
does not change (often). A good administrator will ensure that the cache is large
enough so that you don't get thrashing -- memory is cheap.
The above might not be true for an ISP that hosts many virtual sites, but
then APC may not help them much.
What APC does is to eliminate the continual recompilation to bytecode.
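The compile-once behaviour can be sketched as a small cache keyed by file path. This is a toy model, not APC's implementation: get_bytecode, compile_file, and the fixed-size table are all invented for illustration, and the "bytecode" is just an integer.

```c
#include <assert.h>
#include <string.h>

#define CACHE_SLOTS 8

struct cache_entry {
    char path[64];
    int bytecode_id; /* stand-in for the compiled bytecode */
    int used;
};

static struct cache_entry cache[CACHE_SLOTS];
int compile_count = 0;

/* Stand-in for the expensive parse-and-compile step. */
static int compile_file(const char *path)
{
    (void)path;
    return ++compile_count;
}

/* Return cached bytecode for path, compiling only on the first request. */
int get_bytecode(const char *path)
{
    int i;

    assert(strlen(path) < sizeof cache[0].path);
    for (i = 0; i < CACHE_SLOTS; i++) {
        if (cache[i].used && strcmp(cache[i].path, path) == 0)
            return cache[i].bytecode_id; /* hit: no recompilation */
    }
    for (i = 0; i < CACHE_SLOTS; i++) {
        if (!cache[i].used) {
            strcpy(cache[i].path, path);
            cache[i].bytecode_id = compile_file(path);
            cache[i].used = 1;
            return cache[i].bytecode_id;
        }
    }
    return compile_file(path); /* cache full: compile without storing */
}
```

Requesting the same path twice compiles it once. The last branch shows what an undersized cache costs: every request that does not fit is compiled again.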
JIT is different. The language (eg Java) is compiled to a bytecode and then
often stored in a file. This bytecode will then run later, possibly on another
machine. The JIT translates the bytecode to machine code just prior to it
being executed - a function/... at a time, only translating the bits that
are needed.
JIT translates the bits of bytecode that are needed into machine code.
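A minimal sketch of "translate only what is needed" in C. No machine code is generated here: the translate step is simulated by handing back a pointer to an ordinary C function, and every name (call_function, function_entry, translations_done) is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef long (*native_fn)(long);

/* Stand-ins for machine code a real JIT would generate at run time. */
static long native_double(long x) { return x * 2; }
static long native_square(long x) { return x * x; }

struct function_entry {
    const char *name;
    native_fn target;     /* what "translation" produces for this function */
    native_fn translated; /* NULL until the first call */
};

static struct function_entry functions[] = {
    { "double_it", native_double, NULL },
    { "square_it", native_square, NULL },
};

int translations_done = 0;

/* Pretend to translate bytecode; a real JIT would emit instructions here. */
static native_fn translate(struct function_entry *f)
{
    translations_done++;
    return f->target;
}

/* Call function number idx, translating it first only if nobody has yet. */
long call_function(size_t idx, long arg)
{
    struct function_entry *f;

    assert(idx < sizeof functions / sizeof functions[0]);
    f = &functions[idx];
    if (f->translated == NULL)
        f->translated = translate(f);
    return f->translated(arg);
}
```

Calling function 0 twice translates it once, and function 1 is never translated unless it is called. That is the function-at-a-time behaviour described above.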
So you see that they are really different things. I suppose that you could
add an extra JIT step to APC that translated bytecode to machine code
as it is needed -- to get the big win it probably ought to cache the machine
code as well. I suspect that if this were to be done, APC/translator would
probably be best translating all the bytecode to machine code on the
assumption that most of the code is going to be run eventually anyway
and that the hit of a full translation is small considering that the code
is going to be run many times.
Quite how big the win would be I don't know.
Much PHP execution is NOT nice integer/... loops that make fast machine
code but involves function calls for things like string manipulation;
these functions are in machine code anyway (written in C). So the
improvement might not be as much as you expect.
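This is the usual Amdahl's-law arithmetic, sketched below in C. The function name overall_speedup is made up, and the example figures in the note that follows are illustrative numbers, not measurements of any PHP workload.

```c
#include <assert.h>

/*
 * Overall speedup when only a fraction of the run time gets faster.
 * interp_fraction: share of time spent in the interpreter (0..1).
 * jit_speedup: how many times faster that share becomes (> 0).
 */
double overall_speedup(double interp_fraction, double jit_speedup)
{
    assert(interp_fraction >= 0.0 && interp_fraction <= 1.0);
    assert(jit_speedup > 0.0);
    return 1.0 / ((1.0 - interp_fraction) + interp_fraction / jit_speedup);
}
```

If, say, 20% of a request were spent interpreting bytecode and a JIT made that part 10 times faster, overall_speedup(0.2, 10.0) comes out at about 1.22, because the other 80% (C functions, SQL, I/O) is unchanged.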
There are also some PHP features that I suspect don't map to machine code well
(like the use of the symbol table). I'll let others talk about that.
It would make PHP more complicated to distribute since a translator to
CPU-specific machine code would be needed.
intelligent people of this mailing list beat me down, if the case is
that I'm wrong.
Before thinking about this: try to demonstrate how much we would gain
on typical web sites.
--
Alain Williams
Linux Consultant - Mail systems, Web sites, Networking, Programmer, IT Lecturer.
+44 (0) 787 668 0256 http://www.phcomp.co.uk/
Parliament Hill Computers Ltd. Registration Information: http://www.phcomp.co.uk/contact.php
#include <std_disclaimer.h>
The idea behind PHP from day one was that it was an environment for
wrapping compiled code. Things that are performance critical is written
in C/C++ and things that aren't are left in the PHP templates. Whether
you issue an SQL query from PHP or from a compiled C program doesn't
affect the overall performance of the system so you might as well do
that from PHP.
If you are calculating a fractal, you write it in C and expose it to PHP
with a get_fractal($args) function call so you can mark it up and easily
change the args passed to the underlying function.
Ah yes, thank you (the people who built the system in PHP) for that, it
is very easy to use.
More to the point, I don't think this should be a requirement. It
is possible to call PHP functions from within machine code, so
why not build functions in the PHP language and keep them as
machine code for each machine? The technology exists so that you
don't have to interpret the source again until you see it has been
modified and need to redo the conversion.
Not everyone knows C/C++, not everyone cares to know C/C++, not everyone
should need to know C/C++. That requirement for speed intensive code
bothered me and still does. Yes, thank you, there are other languages,
but I'd rather learn PHP until I reach an advanced level, and I'm not there
yet. One should always learn one language really well before going on to
"greener" pastures.
The second reason is that it is harder to get someone to compile your
extension and include it than it is to get someone to use your PHP
language library or script. What you stated is what ADOdb does, but I
would suspect that not many that use it have the extension also. That is
the point, I shouldn't have to write C, unless the feature doesn't
exist. If I have to write C for PHP functions, then what is the point?
PHP does do a really great job of making it extremely easy to write
extensions or as much as is possible. You still have to worry about
whether or not the user is able to add the extension, the user knows
enough to install it, or would take the time to do so.
I do remember reading something that JIT compiling can be better than
regular compilation since it can do the sensing you described.
It is really important for PHP to have as little overhead as possible
between itself and the speed-critical code behind it and less important
that the userspace executor is fast. That doesn't mean it should be
slow. It should be as fast as we can make it, but not at the cost of
convenience.
Yes, I'm saying the userspace executor can be quicker using JIT
compilation. Well, to be completely accurate, there would be a little
tiny, you'll-barely-notice-it-is-there delay during the JIT stage. It
should be manageable.
What APC does is to eliminate the continual recompilation to bytecode.
Bytecode cache? That is interesting, I understand how APC is used, I
just don't know the implementation. It is nice that APC uses bytecode
instead of PHP opcodes as it might be a little bit quicker.
Why not go a step further and compile to machine code? I like the idea
of using opcode to transfer from machine to machine (such as in Java),
but it was always confusing why PHP's standard package never just saved
the opcode. Sure there are compilers (some don't work well, others you
have to pay for), but the technology already exists in PHP.
The idea of JIT compiling is really to do away with repeatedly
interpreting, compiling, and running, and instead to save the
machine code for later use. You can cut that all down to just one
stage. PHP runs the compiled code on top of itself and does the whole
interpreting stage only if it needs to.
It would make PHP more complicated to distribute since a translator to
CPU specific machine could would be needed.
That is where the Lightning library would come in. I would suspect
that the library would be able to sense the CPU architecture at compile
time.
Before thinking about this: try to demonstrate how much we would gain
on typical web sites.
Ah yes, the million dollar question. How much would be gained from
JIT compiling? Would there be a benefit? Currently with the commercial
PHP compiler, there is an overall increase of milliseconds (not much if
you think of it). The only way to answer this is doing it and giving the
results. Give me a year or two, I just didn't want to start it if
someone else was doing it.
The discussion with my teacher was that when you have an algorithm that
uses a loop in an interpreted language, it will always be faster to have
that in machine code (eh, you could argue that a poor assembler
algorithm would be slower than a better written C algorithm). I
would assume there is overhead in the PHP engine, which has to read the
code and then run the compiled routine. If it were machine code,
PHP would not be required to read the opcode, which would remove that
overhead from the process.
How much would you save? A few milliseconds at least. When you have a
horrible loop-intensive algorithm, you would be able to tell the
difference, as my teacher was. I think it is pretty... well, I'd rather
stay away from C web programming, thank you.
Most Web sites only take a few milliseconds to process anyway and most
overhead is from SQL transactions. Still, I contend that JIT compiling
would still be worth looking at.
Really, I would think the biggest overhead (even with APC) is
interpreting and compiling the source. It is possible to save the
machine code for later use for shared hosting. For the idea of using
APC, I kind of like it and will take a look at the source to see if it
is possible for me.
JIT doesn't have to be a function at a time, it can be whole classes,
files, or the entire script.
The symbol table would map better to assembler, I mean machine code
since basically you are working with stacks anyway. The PHP symbol table
in C is just easier to write and maintain. That probably would have to
be compiled at run time.
Personally, I would like to do without the whole interpreting of the
source and have PHP as a completely compiled (as in bytecode) language.
I suspect that would not go over too well, so therefore I propose
something that is compatible with what PHP does now. Should also be
easier as everything I would need to do already exists.
Jacob Santos
Given the discussion this has provoked, it seems to this naive reader
like it might be a good candidate for SoC.
Just an idea...
--
Some people have a "gift" link here.
Know what I want?
I want you to buy a CD from some starving artist.
http://cdbaby.com/browse/from/lynch
Yeah, I get a buck. So?
Has anyone tried this or know of anyone who is interested in implementing
this for the Zend Engine?
Actually, yes. Gopal (gopalv@php) made a non-public (but working) prototype
of a JIT PHP version. It was based on libjit (used by dotgnu).
I also have a good background in compilers and I would be interested in such
a project (as well as a better GC).
I believe it is possible currently using the Zend Engine and working it
either on top of APC or in place of APC. It would quite possibly help if I
ventured further into the Zend Engine and looked at APC source.
Forget APC. You can hook directly in the engine and do the JITing from
there. APC could later be extended to also cache the compiled code, but at
first shot you don't need it.
I don't think it is a difficult task to add JIT compilation to PHP, but we
need someone who steps forward, does the first part, and then convinces the
other developers that it is good.
Ah, please remember that PHP works on a huge number of platforms (from a
toaster to a mainframe), so the library you choose must be really portable
and support many archs.
Nuno
Has anyone tried this or know of anyone who is interested in
implementing this for the Zend Engine?
http://www.php-mag.net/magphpde/magphpde_article/psecom,id,729,nodeid,21.html
I'd settle for faster method dispatching, but JIT would be great. Can
always use the speed. :)
The idea behind PHP from day one was that it was an environment for
wrapping compiled code. Things that are performance critical is written
in C/C++ and things that aren't are left in the PHP templates. Whether
Best laid plans... :)
Does remind me about the quote regarding the total market for
computers being eight.
:)