Hi,
I've added a new RFC to the wiki
(http://wiki.php.net/rfc/remove_zend_api). It details a plan to try
and decouple the Zend engine from the libraries, in order to allow
large scale changes to the Zend engine in the future. The RFC
describes a prototype phase of the project, which could reasonably be
done within a GSOC project, so I have added it to the GSOC 09 page
(http://wiki.php.net/gsoc/2009#prototyping_removal_of_the_zend_api).
If anybody has any comments, I'd be delighted to hear them. If anybody
knows (or is) a good student looking for a GSOC project (and I've left
it late, there are only 3 days left to apply), please encourage the
student to look at this. Finally, if anybody is interested in helping
mentor this as part of the GSOC, I'd be grateful for the help (I have
to start writing my thesis soon).
Thanks,
Paul
--
Paul Biggar
paul.biggar@gmail.com
Hi Paul et al.,
I fully understand (and even share) your motivations and goals. However it
seems to me that describing an extension in PHP will lead to loss of
performance, as you cannot capture certain C features in PHP. For example,
there are some internal functions that rely on pointer arithmetic to get
decent performance.
Then you may extend PHP to better capture these "dirty tricks", and then
you'll end up with some DSL for building PHP extensions. It's not
necessarily bad, it's just a lot of work.. :)
Moreover, in your example in the wiki you don't include how you would do
parameter parsing. Or do you rely on the code generator to look at the C
function signatures and figure out by itself what to do? (actually there is
some ambiguity, AFAIR, and thus guessing cannot be done reliably)
To summarize my e-mail, I believe this is a very interesting idea, but needs
a lot more thinking :) It's a nice SoC project nevertheless.
Nuno
Hi Paul et al.,
I fully understand (and even share) your motivations and goals. However it
seems to me that describing an extension in PHP will lead to loss of
performance, as you cannot capture certain C features in PHP. For example,
there are some internal functions that rely on pointer arithmetic to get
decent performance.
This is not about capturing every C feature. Instead, it is about
strictly separating the C and PHP code. If someone wants to use C pointer
arithmetic, it is simple to code it on the C side of the line. It's not
necessary to expose the exact C function from the library. Sometimes
you may wish to have a C function wrapping it, to do some "dirty
tricks".
Then you may extend PHP to better capture these "dirty tricks", and then
you'll end up with some DSL for building PHP extensions. It's not
necessarily bad, it's just a lot of work.. :)
This - which I'll call the Pyrex model - is one way to go, but it's not
my preference. While I think it beats the current model, I hope that
it won't be required with what I propose in the RFC.
Moreover, in your example in the wiki you don't include how you would do
parameter parsing. Or do you rely on the code generator to look at the C
functions signatures and figure out by itself what to do? (actually there is
some ambiguity, AFAIR, and thus guessing cannot be done reliably)
That is exactly right. (I'll make this clearer in the RFC). I can't
think of any cases where guessing cannot be done reliably. If you can
give me an example, I'll try and address it.
To summarize my e-mail, I believe this is a very interesting idea, but needs
a lot more thinking :) It's a nice SoC project nevertheless.
It certainly does need more thinking, and I'm hoping that people can
pick holes in it, so that I can fill them. A SoC project would be
ideal, as it would probably expose - and hopefully solve - a great
many flaws.
Thanks for your comments, I'll try and update the RFC in response.
Paul
Hi,
Moreover, in your example in the wiki you don't include how you would do
parameter parsing. Or do you rely on the code generator to look at the C
functions signatures and figure out by itself what to do? (actually there is
some ambiguity, AFAIR, and thus guessing cannot be done reliably)
That is exactly right. (I'll make this clearer in the RFC). I can't
think of any cases where guess cannot be done reliably. If you can
give me an example, I'll try and address it.
Well, take your example:
void Y(char *, int)
Is the second parameter the length of the string or something
independent? Is the char* changed? And who is going to free it?
johannes
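The ambiguity in `void Y(char *, int)` can be made concrete with a small sketch. The two functions below are hypothetical (they are not from the RFC or from any real library); they share exactly that signature, yet a code generator would need different glue for each, because in one the int is the buffer length and in the other it is an unrelated value.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical library function A: `len` is the number of bytes in `buf`.
 * The buffer is modified in place and need not be NUL-terminated. */
void upcase_buffer(char *buf, int len)
{
    assert(buf != NULL && len >= 0);
    for (int i = 0; i < len; i++)
        buf[i] = (char)toupper((unsigned char)buf[i]);
}

/* Hypothetical library function B: `fill` is independent of the string.
 * The string must be NUL-terminated; every byte before the terminator
 * is overwritten with `fill`. */
void fill_string(char *str, int fill)
{
    assert(str != NULL);
    memset(str, fill, strlen(str));
}
```

For function A the generator would have to pass the PHP string's length as the second argument; for function B it would have to expose the int as a separate userland parameter. Neither fact, nor who owns the buffer afterwards, is visible in the prototype alone.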
2009/4/1 Johannes Schlüter johannes@schlueters.de:
Hi,
Moreover, in your example in the wiki you don't include how you would do
parameter parsing. Or do you rely on the code generator to look at the C
functions signatures and figure out by itself what to do? (actually there is
some ambiguity, AFAIR, and thus guessing cannot be done reliably)
That is exactly right. (I'll make this clearer in the RFC). I can't
think of any cases where guess cannot be done reliably. If you can
give me an example, I'll try and address it.
Well, take your example:
void Y(char *, int)
Is the second parameter the length of the string or something
independent? Is the char* changed? And who is going to free it?
Good points. I had initially thought that there should be some simple
declarative DSL, and later thought 'why can't it be a header file in
the simple case'. I guess this is why.
I think that to handle more complex cases we need the kind of
information which makes it straightforward to generate code for
a seamless interface between C and the engine API. The only case
I had thought of was to somehow mangle structs/pointers into
resources. But I suppose we need lengths for strings. I expect (many?)
more of these cases will come up.
(Of course, this is why I recommended a SoC project to try it)
Thanks for the comments, I'll update the RFC.
Paul
Hi,
I think that to handle more complex cases we need the kind of
information which makes it straightforward to easily generate code to
make a seamless interface between C and the engine API. The only case
I had thought of was to somehow mangle structs/pointers into
resources. But I suppose we need lengths for strings. I expect (many?)
more of these cases will come up.
Well, as soon as any pointer exists you need manual work for a special
case. And even when only using integers it's not fully straightforward:
There are cases where not the full integer range is allowed but just a
few flags or some specific range. C programmers will know that; passing
that 1:1 to PHP userland can be bad.
For simple cases http://pecl.php.net/package/ffi might be enough. For
average cases there are just a few APIs (PHP_FUNCTION,
zend_parse_parameters, RETURN_*) one has to know to get started with an
extension. Hartmut's CodeGen_PECL abstracts that using some XML, and
then there's PEAR's Inline_C as a "weird" approach.
I'd be happy to have some simple toolkit for this, but I guess it's
really hard to make some easy tool which really works in average cases
not just in proof-of-concept cases. This might also be interesting for
other projects like ProjectZero (PHP using a JVM) or pipp (using Parrot)
johannes
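The point about integers that accept only a few flags or a limited range can be illustrated with a small validation layer. This is only a sketch under stated assumptions: the `LIB_FLAG_*` constants, the 0..9 level range, and `lib_check_args` are all hypothetical names invented for the example, standing in for whatever a real library documents.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical library constants: only these flag bits are meaningful. */
#define LIB_FLAG_READ   0x1L
#define LIB_FLAG_WRITE  0x2L
#define LIB_FLAG_APPEND 0x4L
#define LIB_FLAG_MASK   (LIB_FLAG_READ | LIB_FLAG_WRITE | LIB_FLAG_APPEND)

/* True if no bits outside the known flags are set. */
bool lib_flags_valid(long flags)
{
    return (flags & ~LIB_FLAG_MASK) == 0;
}

/* True if the level lies in the (assumed) documented range 0..9. */
bool lib_level_valid(long level)
{
    return level >= 0 && level <= 9;
}

/* Hypothetical glue: checks userland integers before they reach the
 * library. Returns 0 and fills the outputs on success; returns -1 and
 * leaves the outputs untouched if either value is rejected. */
int lib_check_args(long flags, long level, int *out_flags, int *out_level)
{
    assert(out_flags != NULL && out_level != NULL);
    if (!lib_flags_valid(flags) || !lib_level_valid(level))
        return -1;
    *out_flags = (int)flags;
    *out_level = (int)level;
    return 0;
}
```

A plain `int` in a header file carries none of this information, so a generator would need it from some extra declaration, or a person would have to write the check by hand.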
2009/4/1 Johannes Schlüter johannes@schlueters.de:
Hi,
I think that to handle more complex cases we need the kind of
information which makes it straightforward to easily generate code to
make a seamless interface between C and the engine API. The only case
I had thought of was to somehow mangle structs/pointers into
resources. But I suppose we need lengths for strings. I expect (many?)
more of these cases will come up.
Well, as soon as any pointer exists you need manual work for a special
case. And even when only using integers it's not fully fast-forward:
There are cases where not the full integer range is allowed but just a
few flags or some specific range. C programmers will know that, passing
that 1:1 to PHP userland can be bad.
Well, it depends on what the pointer does, of course. I don't know if we
need to support the general case of 'anything goes with pointers'.
Instead, I had been thinking that the pointer would be a pointer to a
struct, in the manner of 'OO-in-C'.
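As a rough illustration of mangling struct pointers into resources, here is a minimal sketch of a handle table: the C side keeps the pointer and the other side only ever sees a small integer id. The names (`handle_register`, `handle_lookup`, `handle_release`, `struct counter`) and the fixed table size are assumptions made for the example; this is not how the Zend resource list is implemented.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_HANDLES 16

/* Hypothetical 'OO-in-C' library object. */
struct counter {
    int value;
};

/* Slot i holds the pointer registered under id i + 1; NULL means free. */
static void *handle_table[MAX_HANDLES];

/* Stores the pointer and returns an id >= 1, or 0 if the table is full. */
int handle_register(void *ptr)
{
    assert(ptr != NULL);
    for (int i = 0; i < MAX_HANDLES; i++) {
        if (handle_table[i] == NULL) {
            handle_table[i] = ptr;
            return i + 1;
        }
    }
    return 0;
}

/* Returns the pointer for an id, or NULL for an unknown or released id. */
void *handle_lookup(int id)
{
    if (id < 1 || id > MAX_HANDLES)
        return NULL;
    return handle_table[id - 1];
}

/* Frees the slot; the object itself stays owned by the caller. */
void handle_release(int id)
{
    if (id >= 1 && id <= MAX_HANDLES)
        handle_table[id - 1] = NULL;
}
```

Because every access goes through `handle_lookup`, a bad id from userland yields NULL instead of a wild pointer, which is the property that makes this style easier to generate glue for than arbitrary pointer use.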
For simple cases http://pecl.php.net/package/ffi might be enough, for
average cases there are just a few APIs (PHP_FUNCTION,
zend_parse_parameters, RETURN_*) one has to know for a start for an
extensions, Hartmut's CodeGen_PECL abstracts that using some XML and
then there's PEAR's Inline_C as some "weird" approach.
I'll take a look at these, thanks for the pointers. However, the main
idea is not exactly which tool we use, just that we no longer use the Zend
API.
I'd be happy to have some simple toolkit for this, but I guess it's
really hard to make some easy tool which really works in average cases
not just in proof-of-concept cases. This might also be interesting for
other projects like ProjectZero (PHP using a JVM) or pipp (using Parrot)
Yes. This is one of the motivations. In theory, Project Zero (et al)
would generate their own code from the library spec. AFAIK, they
currently go through the Zend API, which I believe they're not too happy
about.
Thanks,
Paul
Hi Paul,
This is something I have considered in the past, esp. as it would also reduce the dependency of extensions on the PHP runtime and make it easier for 3rd parties to distribute PHP extensions which don't have to be rebuilt per PHP version. This is similar to JNI.
There are some real challenges though, and JNI is a good example of those challenges. In order to completely abstract the API from the data structures you need higher-level API calls, esp. for things like arrays and objects, which typically incur a significant performance loss. JNI sucks big time on that front. Also it often leads to additional data copying.
Also this doesn't necessarily have to replace the Zend API but could in fact be an engine-independent API. Over time, if everyone adopts it, then we could get rid of the Zend API. However, if what I say above is correct, we may find that it's actually very complementary and that some core extensions prefer to hook into the engine very tightly while third parties (e.g. pdflib) and less core extensions prefer to stick to an independent API which can work across not only mini releases of PHP but also minor and in some cases major releases of PHP.
This API would need to be designed in great detail and we would need to make sure it is long lasting.
My 2 cents.
Andi
Hi Paul,
This is something I have considered in the past esp. as it would also reduce dependency of extensions on PHP runtime and make it easier for 3rd parties to distribute PHP extensions which don't have to be rebuilt per-PHP version. This is similar to JNI.
It is similar to JNI. This has been done many times before for many
languages, including Python's Pyrex and ctypes, Ruby's FFI, Java's JNI
and JNA, and no doubt countless others. The only difference here is
that I recommend that we make this the only interface (or as close
as we can make it) from the interpreter internals.
There are some real challenges though and JNI is a good example of those challenges. In order to completely abstract the API from data structure you need higher level API calls esp. for things like arrays and objects which typically incur a significant performance loss. JNI sucks big time on that front. Also it often leads to additional data copying.
All of this happens at the moment to marshal data into zvals. My RFC
does not intend to add to this complexity, but rather to make it work
exactly the same as it does now. So if currently a library avoids
copying values, it should be possible to keep that property. If the
library cannot currently avoid it, I do not expect to be able to avoid
it with a new scheme.
This is very much more important for PHP than JNI is to Java. Every
library shipped with PHP (including most of SPL I believe) is tightly
coupled to the interpreter. By contrast, the vast majority of Java's
Class library is written in Java.
Also this doesn't necessarily have to replace the Zend API but in fact be an engine independent API. Over time if everyone adopts then we could get rid of Zend API. However, if what I say above is correct, we may find that it's actually very complementary and that some core extensions prefer to hook into the engine very tightly while third parties (e.g. pdflib) and less core extensions prefer to stick to an independent API which can work across not only mini release of PHP but also minor and in some cases major release of PHP.
It doesn't have to, but I think it should. But it would be insane to
expect a new scheme to replace the current one, unless it works
universally.
Core "extensions", like important array and string functions, will
probably need to be tightly coupled to the interpreter. Some other
extensions would too, like Xdebug. If people could suggest other
extensions which should not be decoupled, I would appreciate it.
This API would need to be designed in great detail and we would need to make sure it is long lasting.
I could not agree more.
My 2 cents.
Andi
Thanks for the input, the more the merrier :)
Paul
Hi Andi,
Hi Paul,
This is something I have considered in the past esp. as it would also reduce dependency of extensions on PHP runtime and make it easier for 3rd parties to distribute PHP extensions which don't have to be rebuilt per-PHP version. This is similar to JNI.
I'm working on this a little more atm. When you were considering it
before, was anything committed to paper/email? If so, I would be
interested to see any thoughts or discussions.
Thanks,
Paul
Paul Biggar paul.biggar@gmail.com wrote on 31/03/2009 00:06:33:
I've added a new RFC to the wiki
(http://wiki.php.net/rfc/remove_zend_api). It details a plan to try
and decouple the Zend engine from the libraries, in order to allow
large scale changes to the Zend engine in the future. The RFC
describes a prototype phase of the project, which could reasonably be
done within a GSOC project, so I have added it to the GSOC 09 page
(http://wiki.php.net/gsoc/2009#prototyping_removal_of_the_zend_api).
Hi Paul,
This is certainly an interesting project. I work on ProjectZero and
I see from the wiki that you have looked at the approach we have taken.
As you correctly point out Project Zero wants to allow users to re-use the
majority of PHP extensions without re-writing them and as you observe,
using the existing interface as we do today brings a number of problems.
We would also like to enable others to attach arbitrary PHP extensions
written in C to ProjectZero.
So we would like to see the "PHP Native Interface" be successful and would
like to help if we can.
A few of the most significant issues from my perspective:
- PHP arrays present a significant issue. Look at the code in array.c.
Much of this code rummages directly in the internals of the Zend Engine
implementation of hashtables, and needs to in order to achieve reasonable
performance. We were unable to attach this code to a JVM implementation
of PHP and rewrote it in Java. Perhaps we will need to accept that the
array manipulation functions and a small set of other built-in
extensions must continue to use the internal interfaces. It's also worth
mentioning that today many extensions make use of the Zend HashTable
implementation for their own purposes (as a general library function)
in addition to using the HashTable as an interface.
- Memory management. If we separate extensions from the internal
implementation of Zvals then it becomes difficult to manage memory
allocated by the extension during a request. This "falls out in the
wash" today because extensions participate in the Zend engine's
reference counting scheme, which allows memory to be de-allocated once
the refcount falls to zero.
- A logistical problem seems to me that in order for this project to
gain traction a significant number of extensions would need to adopt
it. In order for extensions to adopt it, we would need to convince
their maintainers that the project had traction.
I wonder whether improving the interface could be combined with some of
the unicode work so that the resulting porting work for unicode was
simpler?
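To make the memory-management point above concrete, here is a minimal sketch of reference counting in C. It is not the Zend engine's zval implementation; `struct value`, the `value_*` functions, and the live-object counter are hypothetical names used only to show how memory is released once the count falls to zero.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Number of values currently allocated; kept only for illustration. */
static int live_values = 0;

struct value {
    int refcount;
    char *data;
};

/* Allocates a value holding a copy of `s`, with one reference.
 * Returns NULL if allocation fails. */
struct value *value_new(const char *s)
{
    assert(s != NULL);
    struct value *v = malloc(sizeof *v);
    if (v == NULL)
        return NULL;
    size_t n = strlen(s) + 1;
    v->data = malloc(n);
    if (v->data == NULL) {
        free(v);
        return NULL;
    }
    memcpy(v->data, s, n);
    v->refcount = 1;
    live_values++;
    return v;
}

/* Records one more holder of the value. */
void value_addref(struct value *v)
{
    assert(v != NULL);
    v->refcount++;
}

/* Drops one reference and frees the value when none remain. */
void value_release(struct value *v)
{
    assert(v != NULL && v->refcount > 0);
    if (--v->refcount == 0) {
        free(v->data);
        free(v);
        live_values--;
    }
}

int value_live_count(void)
{
    return live_values;
}
```

An extension that no longer sees the engine's values directly would need some equivalent of `value_addref` and `value_release` exposed through the new interface, or a different ownership rule altogether.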
Rob Nicholson