The fact that hhvm implements a significant part of its extensions (or
other areas) using PHP plus additional syntax, as well as adding cleaner
APIs or mechanisms for the C parts, only confirms one thing for me: the
very first problem we have to solve is to ease extension creation
by drastically changing the internals APIs & tools. Bundling scripts
does not help here; we are using scotch tape to repair something
that should have been replaced or redesigned long ago. I am
not blaming anyone; the engine design, historically, does not make
such changes easy.
Funny to see you mention this. I literally just pulled together a
meeting today to discuss HHVM's admittedly unstable extension API.
One idea to emerge from this was to design a new extension API
agnostic of underlying implementation. An API which, if done right,
could be adapted equally to both PHP7 and HHVM and provide the
opportunity to introduce PHP5 shims so that a single extension could
be written to function identically under any PHP runtime, and any
version. If done right, it could make extensions not just source
compatible, but binary compatible as well. The engine details can
change, but the public-facing extension API could offer a consistent
way of doing the one and only thing that extensions should have to do:
Glue PHP into external libraries.
That's a bit pie in the sky, I'll admit, but wouldn't that be cool?
Fact is, JNI does this for Java already, so there's precedent to
learn from. Heck, we're actually halfway there with HHVM's
"ext_zend_compat" layer, which makes PHP extensions (mostly) source
compatible with HHVM.
While I could work on this in the dark, manipulating HHVM's APIs with
one hand and adding proxy interfaces to PHP (as an extension) with the
other, I'd much rather have involvement from others.
What do you think?
-Sara
Hey Sara,
Yes, I like this idea.
Three APIs for TheFacebook under the sky,
Seven for the Zend-lords in their halls of stone,
Nine for the Engines doomed to die,
One for the Dark Lord on his dark throne,
In the Land of Native where the Undefined Behaviours lie.
One API to rule them all, One API to find them,
One API to bring them all and in the darkness bind them
In the Land of Native where the Undefined Behaviours lie.
ahem Er, sorry about that.
I think this is a good idea. Most PHP extensions do not require tight coupling to the Zend Engine (or indeed HHVM). Something simple which exposes class, function and constant definition, and abstract PHP value (zval) handling, would be enough for the vast majority of extensions, I should think.
There’ll always be a place for tightly-coupled extensions which need to use the engine’s “native” API, but something simple and cross-implementation would work for most extensions, allow better cross-implementation PHP code compatibility, and generally be a Good Thing(TM), I think.
--
Andrea Faulds
http://ajf.me/
Three APIs for TheFacebook under the sky,
Seven for the Zend-lords in their halls of stone,
Nine for the Engines doomed to die,
One for the Dark Lord on his dark throne,
In the Land of Native where the Undefined Behaviours lie.
One API to rule them all, One API to find them,
One API to bring them all and in the darkness bind them
In the Land of Native where the Undefined Behaviours lie.
slowclap
I think this is a good idea. Most PHP extensions do not require
tight coupling to the Zend Engine (or indeed HHVM). Something
simple which exposes class, function and constant definition,
and abstract PHP value (zval) handling, would be enough for
the vast majority of extensions, I should think.
Exactly. This is what we learned from making ext_zend_compat. Basic
glue extensions are the majority; where a lot of extensions currently
"link deep", it's simply because they can, not because they need to.
There’ll always be a place for tightly-coupled extensions which
need to use the engine’s “native” API, but something simple and
cross-implementation would work for most extensions, allow
better cross-implementation PHP code compatibility, and generally
be a Good Thing(TM), I think.
For the handful that do funky stuff like OpCache+, Xdebug, etc... They
can explicitly reach behind the artifice with the understanding that
they're not necessarily going to work from version to version without
loads of #ifdefs and kinky machinations.
-Sara
Hi Sara,
Somewhat pie in the sky, but also achievable with good
planning and design. I'd definitely support this. Such an abstraction layer
would ease extension maintenance: extensions would no longer have to be
updated constantly to support the latest versions of PHP for lack of an
abstraction.
Did that make sense?
Somewhat Pie in the sky but at the same time also achievable with good
planning and design. I'd support this definitely and this abstraction layer
would solve constant maintainability of extensions by not having to update
themselves constantly to support latest versions of PHP due to a lack of
abstraction.
Yep. Exactly what I'm thinking.
I foresee defining this PHP<->Native Interface (I hereby dub it "PNI")
as part of the php-langspec such that part of being a conforming
implementation includes supporting this API.
-Sara
Hi Sara,
Obviously a great idea that would make a lot of people's lives easier all
around.
One question though: how would API extensions be handled on both sides, and
would it be at all possible to keep the APIs consistent across hhvm / php
versions?
I imagine that there would need to be consensus between the php and hhvm
teams to add any extra functionality, is this correct?
On Fri, Jan 9, 2015 at 5:48 PM, Paul Dragoonis dragoonis@gmail.com wrote:
Somewhat Pie in the sky but at the same time also achievable with good
planning and design. I'd support this definitely and this abstraction layer
would solve constant maintainability of extensions by not having to update
themselves constantly to support latest versions of PHP due to a lack of
abstraction.
Yep. Exactly what I'm thinking.
I foresee defining this PHP<->Native Interface (I hereby dub it "PNI")
as part of the php-langspec such that part of being a conforming
implementation includes supporting this API.
-Sara
Hi!
Funny to see you mention this. I literally just pulled together a
meeting today to discuss HHVM's admittedly unstable extension API.
One idea to emerge from this was to design a new extension API
agnostic of underlying implementation. An API which, if done right,
What extensions do right now is pretty closely linked to how the engine
is functioning. Of course, for simple things like "take number from C
library, wrap it in zval, send to PHP" the API would be pretty simple,
but if you have something like simplexml, it relies a lot on the details
of the engine and how things are done there.
So I think it would be useful to define what exactly would be covered
and what not, i.e. which APIs are supported, which can be supported,
which can not. Right now we don't even have the API as such for the
engine itself (i.e., you can just go to EG(...) or bits of the zval and
mess with them, and not only you can but many extensions do).
Also I wonder how that would sit with phpng effort. Lately it was very
focused on performance-based optimizations, and working via abstract
APIs may not be always the most performant way of doing things.
learn from. Heck, we're actually halfway there with HHVM's
"ext_zend_compat" layer, which makes PHP extensions (mostly) source
compatible with HHVM.
Are there any docs for what ext_zend_compat does and which APIs it
supports/doesn't support?
Stas Malyshev
smalyshev@gmail.com
So I think it would be useful to define what exactly would be covered
and what not, i.e. which APIs are supported, which can be supported,
which can not. Right now we don't even have the API as such for the
engine itself (i.e., you can just go to EG(...) or bits of the zval and
mess with them, and not only you can but many extensions do).
My idea is to cover most (but not all) extensions with a narrower,
simplified API. As you say, many interactions fall into the "marshal
this engine value to a C type as needed". This can cater to the
bread-and-butter of PHP extensions very well.
I'd want to export something like the Z_TYPE/Z_VAL style macros (or a
variant thereof, maybe... dare I say it "pval" based?),
zend_(hash|symtable)_() APIs (though maybe more like the add_() APIs
and the array_() APIs I proposed a year or so ago) for dealing with
arrays, obj_(read|write)(dim|prop)(), invoke(func|method)() for
calling into userspace, etc...
To cover the least-common denominator, it'd have to be a C api, but in
my perfect world we offer a C++ api alongside it. I can't tell you
how much more enjoyable HHVM's array API is. int total =
arr["v1"].toInt64() + arr["v2"].toInt64(); versus... well, you know
how many lines that'd take in a PHP extension... Keeping these in
sync doesn't have to be difficult since one can call through the
other. Add some inline hinting and it doesn't even need to be
expensive.
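To make the "marshal this engine value to a C type" idea concrete, here is a minimal sketch of what a C-level value and array accessor could look like. Every name in it (pval as a struct, pni_type, pni_to_int64, pni_array_get_int64, demo_total) is an assumption made up for illustration; nothing here is an existing PHP or HHVM API, and the conversion rules are deliberately simplified.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical engine-neutral value; none of these names exist in PHP or HHVM. */
typedef enum { PNI_NULL, PNI_LONG, PNI_DOUBLE, PNI_STRING } pni_type;

typedef struct {
    pni_type type;
    union {
        int64_t lval;
        double dval;
        const char *sval; /* borrowed, NUL-terminated */
    } u;
} pval;

typedef struct { const char *key; pval value; } pni_entry;
typedef struct { const pni_entry *entries; size_t count; } pni_array;

pval pni_long(int64_t v) { pval p; p.type = PNI_LONG; p.u.lval = v; return p; }
pval pni_string(const char *s) { pval p; p.type = PNI_STRING; p.u.sval = s; return p; }

/* Simplified loose conversion: strings are parsed as base-10 integers. */
int64_t pni_to_int64(const pval *p) {
    assert(p != NULL);
    switch (p->type) {
    case PNI_LONG:   return p->u.lval;
    case PNI_DOUBLE: return (int64_t)p->u.dval;
    case PNI_STRING: return (int64_t)strtoll(p->u.sval, NULL, 10);
    default:         return 0; /* PNI_NULL reads as 0 */
    }
}

/* Linear lookup; a missing key reads as 0 in this sketch. */
int64_t pni_array_get_int64(const pni_array *arr, const char *key) {
    for (size_t i = 0; i < arr->count; i++) {
        if (strcmp(arr->entries[i].key, key) == 0)
            return pni_to_int64(&arr->entries[i].value);
    }
    return 0;
}

/* C counterpart of: arr["v1"].toInt64() + arr["v2"].toInt64() */
int64_t demo_total(void) {
    pni_entry entries[2];
    entries[0].key = "v1"; entries[0].value = pni_long(40);
    entries[1].key = "v2"; entries[1].value = pni_string("2");
    pni_array arr = { entries, 2 };
    return pni_array_get_int64(&arr, "v1") + pni_array_get_int64(&arr, "v2");
}
```

The point of the sketch is only that the extension never touches the value's layout directly; a C++ wrapper could call through the same accessors.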
As for reaching deep into the engine, I think we can just offer better
APIs. So like, where exts currently use EG(function_table), they can
use function_invoke_named("strtolower", arg1, arg2, arg3); or f =
function_find("strtolower"); function_invoke(f, arg1, arg2, arg3);
for cachable functions. In the API layer, typeof(f) is a php_function
which translates to a struct of zend_fcall_info/zend_fcall_info_cache
for PHP, or HPHP::Func* for HHVM. (As an example)
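A rough sketch of that lookup-then-invoke shape follows. The names function_find, function_invoke, function_invoke_named and php_function come from the paragraph above; the signatures, the in-memory table and the toy implementations are assumptions. In a real layer php_function would wrap the engine's own handle rather than a plain function pointer.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Opaque to extensions. Here it wraps a plain C function pointer as a
 * stand-in for whatever the engine uses internally (assumption). */
typedef struct php_function {
    const char *name;
    void (*impl)(char *s);
} php_function;

static void impl_strtolower(char *s) {
    for (; *s; s++) *s = (char)tolower((unsigned char)*s);
}

static void impl_strtoupper(char *s) {
    for (; *s; s++) *s = (char)toupper((unsigned char)*s);
}

/* Toy function table standing in for the engine's registry. */
static const php_function function_table[] = {
    { "strtolower", impl_strtolower },
    { "strtoupper", impl_strtoupper },
};

/* Resolve once; the returned handle can be cached by the caller. */
const php_function *function_find(const char *name) {
    size_t n = sizeof(function_table) / sizeof(function_table[0]);
    for (size_t i = 0; i < n; i++) {
        if (strcmp(function_table[i].name, name) == 0)
            return &function_table[i];
    }
    return NULL;
}

/* Call through a previously resolved handle. */
void function_invoke(const php_function *f, char *arg) {
    assert(f != NULL);
    f->impl(arg);
}

/* Convenience form: resolve and call in one step; -1 if unknown. */
int function_invoke_named(const char *name, char *arg) {
    const php_function *f = function_find(name);
    if (f == NULL) return -1;
    function_invoke(f, arg);
    return 0;
}
```

The by-name form is the simple path; the find/invoke pair exists so hot code can skip the lookup on repeated calls.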
Similarly EG(symbol_table)/EG(active_symbol_table) can hide behind
either table = symbol_table_get(LOCAL|GLOBAL); array_set_value(table,
"foo", pval); or maybe something more specific like
symboltable_set_value("foo", pval, LOCAL|GLOBAL);
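As a sketch of the first variant (get a table by scope, then set a value in it), assuming fixed-size tables and integer-only values to keep it short. symbol_table_get is the name used above; pni_scope, pni_table, table_set_value and table_get_value are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical scope selector standing in for LOCAL|GLOBAL. */
typedef enum { PNI_LOCAL = 0, PNI_GLOBAL = 1 } pni_scope;

#define PNI_MAX_SYMBOLS 16

typedef struct { const char *name; int64_t value; } pni_symbol;
typedef struct { pni_symbol symbols[PNI_MAX_SYMBOLS]; size_t count; } pni_table;

/* One table per scope; the engine's real tables stay hidden behind this. */
static pni_table tables[2];

pni_table *symbol_table_get(pni_scope scope) {
    return &tables[scope];
}

/* Insert or overwrite; returns -1 when the fixed-size table is full. */
int table_set_value(pni_table *t, const char *name, int64_t value) {
    assert(t != NULL);
    for (size_t i = 0; i < t->count; i++) {
        if (strcmp(t->symbols[i].name, name) == 0) {
            t->symbols[i].value = value;
            return 0;
        }
    }
    if (t->count == PNI_MAX_SYMBOLS) return -1;
    t->symbols[t->count].name = name; /* borrowed pointer in this sketch */
    t->symbols[t->count].value = value;
    t->count++;
    return 0;
}

/* Lookup; returns -1 when the name is not defined in this scope. */
int table_get_value(const pni_table *t, const char *name, int64_t *out) {
    for (size_t i = 0; i < t->count; i++) {
        if (strcmp(t->symbols[i].name, name) == 0) {
            *out = t->symbols[i].value;
            return 0;
        }
    }
    return -1;
}
```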
And on and on and on for other EG() style accesses. I don't think we
need to try to address deep engine elements like
EG(current_execute_data) (or whatever it's called, it's been awhile...
:p). There's a line in there somewhere that we can say "for this,
just reach behind the scenes, but you'll need to fix your stuff from
version to version and all the #ifdefs that entails". Like... I don't
think we need to support runkit, for example. :)
Also I wonder how that would sit with phpng effort. Lately it was very
focused on performance-based optimizations, and working via abstract
APIs may not be always the most performant way of doing things.
It's PHPNG that makes this more possible since the structure of the
PHP7 zval is so much closer to what HHVM has been doing with
TypedValue. It's possible to make an abstraction which knows how to
talk to those two, at least, without either having to suffer for it.
Similarly, request-local variables are much more similar now that
tsrm_ls is gone. It's PHPNG that makes such a compatibility
abstraction possible. Yes, we'll need to be very careful not to hurt
performance for either implementation. That would suck and I would
get my butt kicked by both sides. I don't want my butt kicked.
learn from. Heck, we're actually halfway there with HHVM's
"ext_zend_compat" layer, which makes PHP extensions (mostly) source
compatible with HHVM.
Is there some docs for what ext_zend_compat does and which APIs it
supports/doesn't support?
No, building a PHP ext against HHVM just sort of "does" or "does not"
work. The source is under hphp/runtime/ext_zend_compat if you're
curious. The general idea is that it re-exports ZendEngine's headers,
implementing some C apis, defining/inlining others where possible.
-Sara
So I think it would be useful to define what exactly would be covered
and what not, i.e. which APIs are supported, which can be supported,
which can not. Right now we don't even have the API as such for the
engine itself (i.e., you can just go to EG(...) or bits of the zval and
mess with them, and not only you can but many extensions do).
My idea is to cover most (but not all) extensions with a narrower,
simplified API. As you say, many interactions fall into the "marshal
this engine value to a C type as needed". This can cater to the
bread-and-butter of PHP extensions very well.
More than adding a new layer, I would love to see something similar to
hhvm or Zephir available by default. If C is used, then only the
relevant parts have to be implemented by the developers, skipping all
the overcomplicated data mangling, swapping, exchange, etc. between
userland and the engine. It is then relatively easy to end up
generating code for either php, hhvm or any other platform. Using
builtin scripts (yes, in this case it could be a nice thing), it could
become a very nice way to develop php&co extensions.
What I can imagine (I have not remotely tried to implement it yet) is
finding a way to parse, say, a php script which includes custom
sections for C (or C++) code. We could use comments but I do not like
the idea, mainly because it will be tricky to get editor support :)
One problem: I do not think it is possible to customize the current
lexer to allow that on demand, but it could be possible using more
modern lexer tools. I am not sure how flexible the hhvm lexer is or if
we should have yet another language (I would rather use plain PHP
in this case, even if it makes the task slightly harder to implement
or generates slightly bigger native code due to type checking or
conversions).
Cheers,
Pierre
@pierrejoye | http://www.libgd.org
From: Pierre Joye [mailto:pierre.php@gmail.com]
More than adding a new layer, I would love to see something similar to
hhvm or Zephir available by default. If C is used, then only the
relevant parts have to be implement by the developers, skipping all
the over complicated data mangling, swapping, exchange, etc between
userland and the engine. It is then relatively easy to end up
generating codes for either php, hhvm or any other platform. Using
builtin script (yes, in this case it could be a nice thing), it could
became a very nice way to develop php&co extensions.
For what I can imagine (I did not remotely try to implement it yet) is
to find a way to parse, say, a php script which include custom
sections for C (or C++) codes. We could use comments but I do not like
the idea, mainly because it will be tricky to have editors support :)
I am not sure we need a real language here (PHP or other). I would first wait to see if a set of C macros can't do the job.
Cheers
François
More than adding a new layer, I would love to see something similar to
hhvm or Zephir available by default. If C is used, then only the
relevant parts have to be implement by the developers, skipping all
the over complicated data mangling, swapping, exchange, etc between
userland and the engine. It is then relatively easy to end up
generating codes for either php, hhvm or any other platform. Using
builtin script (yes, in this case it could be a nice thing), it could
became a very nice way to develop php&co extensions.
And this is one of the things I love about HHVM's current extension
API, so yeah I'd like to bring some elements of that in if possible.
At one extreme it means changes to the lexer/parser, and the other end
it might mean a combination of prepend files (which are pure PHP) and
either an interface-definition file (simplified version of PHP) or
something like PHP's current API "inject a function/method here, and
here's the signature". I imagine this being one of the longer
portions of our discussions since a lot of folks are going to have
opinions on what approach is better.
For what I can imagine (I did not remotely try to implement it yet) is
to find a way to parse, say, a php script which include custom
sections for C (or C++) codes. We could use comments but I do not like
the idea, mainly because it will be tricky to have editors support :)
That's ambitious, but not impossible. :)
One problem, I do not think it is possible to customize the current
lexer to allow that on demand, but it could be possible using a more
modern lexer tools. I am not sure how flexible the hhvm lexer is or if
we should have yet another language (as I would rather use plain PHP
in this case, even if it makes the task slightly harder to implement
or generate slightly bigger native code due to type checking or
conversions).
PHP as the greatest-common-denominator makes sense to me too. Much as
I like the flexibility of user attributes in HHVM, the goal of this
API layer is to be engine agnostic, so syntax should fall into
php-langspec, deviating as little as possible.
-Sara
I have a working proof of concept using something like:
<?php
function u_foo() {
}
?>
---- native
void foo()
{
    printf("Hello from native!");
}
?>
<?php
... some other codes.
?>
It needs refactoring and some sanity checks, but the concept works. It
relies on my patch to bundle the concatenated PHP script
sections as the builtin script, generating .c files from the other
parts as well as the config.* and other files.
Ideally I'd like to include that in pickle, so one can package a release
from the PHP files/repo to generate the resulting extension package.
I will continue later this week and push it to github as soon as I
have a very rough draft that works for someone other than me :)
For what I can imagine (I did not remotely try to implement it yet) is
to find a way to parse, say, a php script which include custom
sections for C (or C++) codes. We could use comments but I do not like
the idea, mainly because it will be tricky to have editors support :)
That's ambitious, but not impossible. :)
I worked around that for now; it could even work with 5.x if we expose
(internally) zend_execute_string :)
One problem, I do not think it is possible to customize the current
lexer to allow that on demand, but it could be possible using a more
modern lexer tools. I am not sure how flexible the hhvm lexer is or if
we should have yet another language (as I would rather use plain PHP
in this case, even if it makes the task slightly harder to implement
or generate slightly bigger native code due to type checking or
conversions).
PHP as the greatest-common-denominator makes sense to me too. Much as
I like the flexibility of user attributes in HHVM, the goal of this
API layer is to be engine agnostic, so syntax should fall into
php-langspec, deviating as little as possible.
Full ack.
Cheers,
Pierre
@pierrejoye | http://www.libgd.org
Ideally I like to include that in pickle, so one can package a release
from the PHP files/repo to generate the resulting extension package.
I need to be more clear here :)
I'd like to have this in pickle, so the same package can be used to
install extensions (from src, php+native) for hhvm and php using the
same command and release packages/git tags/etc.
Cheers,
Pierre
@pierrejoye | http://www.libgd.org
From: Pierre Joye [mailto:pierre.php@gmail.com]
I have a working proof of concept using something like:
<?php
function u_foo() {
}
?>
---- native
void foo()
{
printf("Hello from native!");
}
?>
<?php
... some other codes.
?>
In your example, I don't see how you execute the foo() C function from PHP.
Would it be possible to embed C code in a PHP function or method?
If so, how do both sides access each other's data? If foo() takes arguments, how would they be passed from PHP? What about scalar type conversions to fit the expected C argument types? The same goes for the return value.
To summarize, I'd like to know how PHP and C code can communicate together.
Regards
François
In your example, I don't see how you execute the foo() C function from
PHP.
PHP_FE/PHP_ME entries are created. I forgot to add foo(); in the example.
Would it be possible to embed C code in a PHP function or method ?
Yes, that is planned.
If yes, how do both codes access each other's data ? If foo() takes
arguments, how would they be transmitted from PHP ? What about scalar type
conversions to fit the expected C argument types ? The same about return
value.
Via zpp (zend_parse_parameters), and the C function signature for the return value.
Also, I'd like to get the base working first. The rest is a spec issue and the
implementation can change if necessary. :)
From: php@golemon.com [mailto:php@golemon.com] On behalf of Sara Golemon
And this is one of the things I love about HHVM's current extension
API, so yeah I'd like to bring some elements of that in if possible.
At one extreme it means changes to the lexer/parser, and the other end
it might mean a combination of prepend files (which are pure PHP) and
either an interface-definition file (simplified version of PHP) or
something like PHP's current API "inject a function/method here, and
here's the signature".
I think it's time to define a scope for a first step, e.g. rules to determine whether a given extension is eligible for the tools we are planning in this first step. Some may consider it a waste of time, but I think we must know where to stop before starting anything.
If you agree, I propose this as a base to refine:
- Extension exposes only PHP functions and constants (no OO),
- Each PHP function argument must accept a scalar, an array, or both ('mixed' case).
- PHP functions return value type can be any scalar type or array.
- one or more PHP function arguments can be optional
- PHP function arguments can be passed by value or by reference.
These constraints should fit a lot of 'bridge-only' extensions. I don't include OO because, IMO, storing properties is going too far for a first step.
I am starting a prototype of a C-code extension generator, written in PHP, using high-level function definitions. For each exposed function, the high-level definitions mostly define input arguments and return value. The generator handles all aspects of argument parsing, going much further than the current parsing API. It also handles return values, keeping the C code away from zval manipulation.
The implementation will be split between a generic part, which will read and format input files in memory, and pluggable generators. Each generator will have the responsibility to generate everything needed by a specific PHP engine.
For instance, the generator for the PHP interpreter will generate everything needed to run phpize/configure/make.
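To illustrate the split between the user-written body and generated glue, here is a small sketch. It is an assumption about the general shape only, not the output of the generator described above: glue_value, coerce_long, user_add and glue_add are hypothetical names, and a real backend would unbox the engine's own values rather than this toy struct.

```c
#include <assert.h>
#include <stddef.h>

/* Toy boxed value standing in for an engine value (assumption). */
typedef enum { GLUE_LONG, GLUE_DOUBLE } glue_kind;

typedef struct {
    glue_kind kind;
    union { long l; double d; } u;
} glue_value;

/* User-written body: plain C types only, no engine structures. */
long user_add(long a, long b) {
    return a + b;
}

/* Scalar conversion the glue performs so the body sees the type it expects. */
static long coerce_long(const glue_value *v) {
    return v->kind == GLUE_LONG ? v->u.l : (long)v->u.d;
}

/* What generated glue for "long add(long a, long b)" might look like:
 * check the argument count, coerce each argument, call the body,
 * then box the return value. Returns -1 on a wrong argument count. */
int glue_add(const glue_value *args, size_t argc, glue_value *ret) {
    assert(ret != NULL);
    if (argc != 2) return -1;
    long a = coerce_long(&args[0]);
    long b = coerce_long(&args[1]);
    ret->kind = GLUE_LONG;
    ret->u.l = user_add(a, b);
    return 0;
}
```

Only glue_add would differ per engine; user_add stays the same for every generator backend.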
I will tell you when I have a model and examples of metadata definition and input files, so that you can tell me if it fits your needs (as I don't know HHVM yet).
One more thing: do you know which existing PECL extension we could use for a proof of concept? It should comply with our scope (basically, expose functions and constants only), and a small one would be better. I thought about newt but it defines more than 200 functions...
I also realize that this stuff should probably go to an RFC now...
Regards
François
From: François Laupretre [mailto:francois@tekwire.net]
If you agree, I propose this as a base to refine:
- Extension exposes only PHP functions and constants (no OO),
- Each PHP function argument must accept a scalar, an array, or both ('mixed' case).
- one or more PHP function arguments can be optional
- PHP function arguments can be passed by value or by reference.
Adding these rules :
- Constant values must be static (cannot be computed during MINIT).
- During MINIT and MSHUTDOWN, no activity except defining/undefining functions and constants.
- No activity during RINIT/RSHUTDOWN.
And changing this one :
- PHP functions return value type can be any scalar type or array.
To:
PHP functions return value must be a fixed type (cannot return different types depending on context). Return value type can be null, bool, long, double, string, or array.
The resulting ruleset :
- Extension exposes only PHP functions and constants (no OO),
- Each PHP function argument must accept a scalar, an array, or both ('mixed' case).
- one or more PHP function arguments can be optional
- PHP function arguments can be passed by value or by reference.
- PHP functions return value must be a fixed type (cannot return different types depending on context). Return value type can be null, bool, long, double, string, or array.
- Constant value must be static (cannot be computed during MINIT).
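The function-related rules above can be read as a simple eligibility check over per-function metadata. The sketch below is one possible reading, not part of the proposal: the ext_* types and function_is_eligible are hypothetical names, and treating null as not acceptable for an argument is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical metadata model for one exposed function. */
typedef enum {
    EXT_NULL, EXT_BOOL, EXT_LONG, EXT_DOUBLE, EXT_STRING,
    EXT_ARRAY, EXT_MIXED, EXT_OBJECT
} ext_type;

typedef struct {
    ext_type type;
    bool optional; /* allowed by the ruleset */
    bool by_ref;   /* allowed by the ruleset */
} ext_arg;

typedef struct {
    const char *name;
    const ext_arg *args;
    size_t argc;
    ext_type ret;
} ext_function;

/* Arguments: a scalar, an array, or both ('mixed'); no objects. */
static bool arg_type_ok(ext_type t) {
    return t == EXT_BOOL || t == EXT_LONG || t == EXT_DOUBLE ||
           t == EXT_STRING || t == EXT_ARRAY || t == EXT_MIXED;
}

/* Return value: one fixed type among null, bool, long, double, string, array. */
static bool ret_type_ok(ext_type t) {
    return t == EXT_NULL || t == EXT_BOOL || t == EXT_LONG ||
           t == EXT_DOUBLE || t == EXT_STRING || t == EXT_ARRAY;
}

bool function_is_eligible(const ext_function *f) {
    assert(f != NULL);
    if (!ret_type_ok(f->ret)) return false;
    for (size_t i = 0; i < f->argc; i++) {
        if (!arg_type_ok(f->args[i].type)) return false;
    }
    return true;
}
```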
I have pushed a first half-baked version of the extension generator I was thinking about. Just for a look, don't try to run it. I started with json and yaml support for metadata definition. We can easily add another supported syntax if needed but these should be enough. Look at https://github.com/flaupretre/php-ext-gen/tree/develop
François
Would it not be better to work on one only?
Also I am really not a fan of yaml&co to generate C code but having
critical parts in C and everything else in straight php :)
From: Pierre Joye [mailto:pierre.php@gmail.com]
Also I am really not a fan of yaml&co to generate C code but having critical parts in C and everything else in straight php :)
I don't use yaml to generate C, just to define metadata, like extension name, arguments/return value characteristics, etc. The glue C code is then generated from this metadata.
A constant definition, for instance, defines a name, a type, and a value. This definition will be given in yaml or json. That's all it is used for. The user writes the function body (between arg parsing and return value) in C.
PHP is fine but defining a tree of metadata in json or yaml is really easier than writing nested array statements.
Regards
François
On Sun, Jan 11, 2015 at 7:44 PM, François Laupretre
francois@tekwire.net wrote:
- Extension exposes only PHP functions and constants (no OO),
- Each PHP function argument must accept a scalar, an array, or both ('mixed' case).
- one or more PHP function arguments can be optional
- PHP function arguments can be passed by value or by reference.
- PHP functions return value must be a fixed type (cannot return different types depending on context). Return value type can be null, bool, long, double, string, or array.
- Constant value must be static (cannot be computed during MINIT).
Why so limited? Like, why disallow objects completely? That feels
like a big step backward for PHP. And why not computed constants?
(Assuming they stay constant throughout the process lifetime)
I have pushed a first half-baked version of the extension generator I was thinking about.
Just for a look, don't try to run it. I started with json and yaml support for metadata definition.
We can easily add another supported syntax if needed but these should be enough.
Look at https://github.com/flaupretre/php-ext-gen/tree/develop
I'll be honest, I'm not a big fan of code-generation. I find it
introduces complexity and it's not needed here. Lots of languages
manage to put together a stable API without resorting to code
generators.
-Sara
From: php@golemon.com [mailto:php@golemon.com] On behalf of Sara Golemon
Why so limited? Like, why disallow objects completely? That feels
like a big step backward for PHP. And why not computed constants?
(Assuming they stay constant throughout the process lifetime)
This is not what we would release, if we decide to do it one day. This scope applies only to a first prototype: just enough to evaluate the concept, without spending too much time before we have something working. The same goes for you when you add the code generation for HHVM. I think it's enough to agree on a concept.
I'll be honest, I'm not a big fan of code-generation. I find it
introduces complexity and it's not needed here. Lots of languages
manage to put together a stable API without resorting to code
generators.
I am not crazy about it either, I just want to generate code for argument parsing (removing every access to zval, dealing with 'mixed' type, checks...), return value (convert to zval, maybe a pair of checks), and the global 'glue' code (global header file, function/constant declare/undeclare, all the boring stuff). Some of this is too complex for C macros. The rest is provided by the user in C and stays in C. This code will use C macros to access what we want to abstract.
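As an illustration of the split described above, here is a small C sketch of what generated glue around a user-written body could look like. Everything in it is an assumption made for the example: the `value` type is a toy stand-in for an engine value (a zval in PHP), and `glue_scale` and `scale` are invented names. The generated part checks the argument count and types, unwraps to C types, applies the default for an optional argument, calls the body, and wraps the result.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for an engine value; a real engine has its own type. */
typedef enum { V_NULL, V_LONG, V_DOUBLE } v_type;
typedef struct {
    v_type type;
    union { long l; double d; } u;
} value;

/* User-written body: plain C types only, no engine knowledge. */
static double scale(long count, double factor)
{
    return (double)count * factor;
}

/* Glue a generator could emit for "double scale(int count [, double factor = 1.0])".
 * Returns 0 on success, -1 on an argument error. */
int glue_scale(const value *args, size_t nargs, value *ret)
{
    assert(ret != NULL);

    if (nargs < 1 || nargs > 2 || args[0].type != V_LONG)
        return -1;

    double factor = 1.0;               /* default for the optional argument */
    if (nargs == 2) {
        if (args[1].type == V_DOUBLE)
            factor = args[1].u.d;
        else if (args[1].type == V_LONG)
            factor = (double)args[1].u.l;   /* widen int to double */
        else
            return -1;
    }

    /* Convert the C return value back into an engine value. */
    ret->type = V_DOUBLE;
    ret->u.d = scale(args[0].u.l, factor);
    return 0;
}
```

Only `glue_scale` would need to change between engine flavors; `scale` stays as written.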
Regards
François
On Mon, Jan 12, 2015 at 4:58 AM, François Laupretre
francois@tekwire.net wrote:
I am not crazy about it either, I just want to generate code for
argument parsing (removing every access to zval, dealing with
'mixed' type, checks...), return value (convert to zval, maybe a
pair of checks), and the global 'glue' code (global header file,
function/constant declare/undeclare, all the boring stuff).
Some of this is too complex for C macros. The rest is provided
by the user in C and stays in C. This code will use C macros to
access what we want to abstract.
Okay, but why even generate code for argument parsing? Why not just
pass the arguments/return-values as their concrete type?
For the record, I'm with you on not having anyone access zvals
directly, or having them deal with zpp, but that doesn't necessitate
generated code.
PHP is fine but defining a tree of metadata in json or yaml is
really easier than writing nested array statements.
Could you explain what you mean by this? I'm not sure where nested
arrays come from.
For example, the PHP code could be something as simple as the
following (but see below, I don't think this is necessarily a good
idea):
function NATIVE_STRING_foo(int $bar, float $baz = 3.14) {}
Although scalar type hints aren't supported, we can actually fake them
in the parser letting them look like class type hints, then converting
them post-parse. Meanwhile, a function with an empty body named
"NATIVE_T_*()" indicates it's a stub, and what type it returns. Ugly,
yes. But compatible with all PHP parsers 5 and later.
Then in the C/C++ side, we present:
php_string foo(int bar, double baz) {
    /* Normal(ish) C code goes here */
}
All that said, I think we can do better than fake typehints and a
kludgy return type, but my point is that we don't need to invent a new
format just to declare functions since we have this nice language
called PHP which already knows what PHP functions look like.
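For what it's worth, recovering the return type from a stub name of the form "NATIVE_T_name" is a few lines of C. The sketch below is only an illustration of that naming convention: `stub_info` and `parse_stub_name` are hypothetical names, and it assumes the type token itself contains no underscore, so the first underscore after the prefix separates the type from the function name.

```c
#include <assert.h>
#include <string.h>

typedef struct {
    char ret_type[16];   /* e.g. "STRING" */
    char name[64];       /* e.g. "foo" */
} stub_info;

/* Split "NATIVE_<TYPE>_<name>" into its parts.
 * Returns 0 on success, -1 if the name is not a stub name
 * or a part does not fit its buffer. */
int parse_stub_name(const char *fn, stub_info *out)
{
    static const char prefix[] = "NATIVE_";
    const size_t plen = sizeof prefix - 1;

    assert(fn != NULL && out != NULL);

    if (strncmp(fn, prefix, plen) != 0)
        return -1;

    const char *type = fn + plen;
    const char *sep = strchr(type, '_');
    if (sep == NULL || sep == type || sep[1] == '\0')
        return -1;                       /* missing type or missing name */

    size_t tlen = (size_t)(sep - type);
    size_t nlen = strlen(sep + 1);
    if (tlen >= sizeof out->ret_type || nlen >= sizeof out->name)
        return -1;

    memcpy(out->ret_type, type, tlen);
    out->ret_type[tlen] = '\0';
    memcpy(out->name, sep + 1, nlen + 1);   /* includes the terminator */
    return 0;
}
```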
-Sara
On Mon, Jan 12, 2015 at 4:58 AM, François Laupretre
francois@tekwire.net wrote:
I am not crazy about it either, I just want to generate code for
argument parsing (removing every access to zval, dealing with
'mixed' type, checks...), return value (convert to zval, maybe a
pair of checks), and the global 'glue' code (global header file,
function/constant declare/undeclare, all the boring stuff).
Some of this is too complex for C macros. The rest is provided
by the user in C and stays in C. This code will use C macros to
access what we want to abstract.
Okay, but why even generate code for argument parsing? Why not just
pass the arguments/return-values as their concrete type?
For the record, I'm with you on not having anyone access zvals
directly, or having them deal with zpp, but that doesn't necessitate
generated code.
PHP is fine but defining a tree of metadata in json or yaml is
really easier than writing nested array statements.
Could you explain what you mean by this? I'm not sure where nested
arrays come from.
For example, the PHP code could be something as simple as the
following (but see below, I don't think this is necessarily a good
idea):
function NATIVE_STRING_foo(int $bar, float $baz = 3.14) {}
Although scalar type hints aren't supported, we can actually fake them
in the parser letting them look like class type hints, then converting
them post-parse. Meanwhile, a function with an empty body named
"NATIVE_T_*()" indicates it's a stub, and what type it returns. Ugly,
yes. But compatible with all PHP parsers 5 and later.
Then in the C/C++ side, we present:
php_string foo(int bar, double baz) {
    /* Normal(ish) C code goes here */
}
I must be missing something, but I do not see how it can be exposed to
userland like that without the execute_data part.
By the way, we may finally have a good usage for the new args macros ;)
All that said, I think we can do better than fake typehints and a
kludgy return type, but my point is that we don't need to invent a new
format just to declare functions since we have this nice language
called PHP which already knows what PHP functions look like.
Yes, only I am not sure we can do it for both 5.x and 7.x. I mean the
native parts, if we manage to provide new APIs (maybe a backport could
be allowed, or it could ship via an "extension" that simply gets bundled).
Cheers,
Pierre
@pierrejoye | http://www.libgd.org
From: php@golemon.com [mailto:php@golemon.com] On behalf of Sara Golemon
Okay, but why even generate code for argument parsing? Why not just
pass the arguments/return-values as their concrete type?
Because what I need is beyond what macros can do. Actually, I don't see how I can do what I have in mind with macros. And, IMHO, it is much more user-friendly. What I am trying to implement is a tool which requires as little knowledge of PHP internals as possible. I want an average C programmer to be able to connect a C library and build the whole extension in a few hours, even if they have never done it before. And the generated extension should work on every PHP flavor supported by the tool.
That's why I am favoring a synthetic definition of the API using a markup language.
If you have a few minutes, I'd like you to read the doc I have written:
https://github.com/flaupretre/php-ext-gen/blob/master/doc/schema.yml is the metadata schema.
https://github.com/flaupretre/php-ext-gen/blob/master/doc/variables.md is an overview of PHP/C variable mapping.
You may also have a look at the 'examples' subdir. Two extensions are defined there. They don't expose much but they are ready for code generation.
The most complex case I have found is a mixed array/string argument, passed by ref, receiving an array, where the function wants to return a string. Doing this without the developer even knowing that zvals exist is quite challenging. I may be wrong, but I think it cannot be done with C macros or with an intermediate bidirectional conversion layer.
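Here is one way to picture that case from the extension author's side, as a C sketch. It rests on assumptions: it reads the scenario as "the string goes back out through the same by-ref argument", and `mixed_arg` and `sum_to_string` are invented for the example, not part of php-ext-gen. The author only sees a small tagged struct; the generated glue (not shown) would fill it from the engine value before the call and write it back afterwards, depending on the final `kind`.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

typedef enum { M_STRING, M_ARRAY } mixed_kind;

/* Hypothetical C-side view of a 'mixed' array/string argument passed
 * by reference. The author never touches the engine value itself. */
typedef struct {
    mixed_kind kind;
    char  *str;     /* valid when kind == M_STRING; heap-allocated */
    long  *items;   /* valid when kind == M_ARRAY;  heap-allocated */
    size_t count;
} mixed_arg;

/* User body: when handed an array of integers, replace it in place
 * with their sum as a decimal string; a string is left untouched.
 * Returns 0 on success, -1 on allocation failure. */
int sum_to_string(mixed_arg *arg)
{
    assert(arg != NULL);

    if (arg->kind != M_ARRAY)
        return 0;

    long sum = 0;
    for (size_t i = 0; i < arg->count; i++)
        sum += arg->items[i];

    char *s = malloc(32);
    if (s == NULL)
        return -1;
    snprintf(s, 32, "%ld", sum);

    /* Switch the argument from array to string; the glue would see
     * the new kind and write a string back to the caller's variable. */
    free(arg->items);
    arg->items = NULL;
    arg->count = 0;
    arg->str = s;
    arg->kind = M_STRING;
    return 0;
}
```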
Could you explain what you mean by this? I'm not sure where nested
arrays come from.
I think we are not talking about the same mechanisms. What I am talking about is the way to describe an extension, so that a tool can take this description and generate the extension for different flavors of PHP engine. As this description is a tree and you said it can be done in PHP, I imagined that you wanted the user to write the metadata tree in PHP. But you probably have something completely different in mind. It seems that, starting with the same needs, we are designing completely different solutions.
For example, the PHP code could be something as simple as the
following (but see below, I don't think this is necessarily a good
idea):
function NATIVE_STRING_foo(int $bar, float $baz = 3.14) {}
Although scalar type hints aren't supported, we can actually fake them
in the parser letting them look like class type hints, then converting
them post-parse. Meanwhile, a function with an empty body named
"NATIVE_T_*()" indicates it's a stub, and what type it returns. Ugly,
yes. But compatible with all PHP parsers 5 and later.
Then in the C/C++ side, we present:
php_string foo(int bar, double baz) {
    /* Normal(ish) C code goes here */
}
Mmh, why not... I agree that, for int and double, it is perfect. For bool too :) But how do you handle passing args by ref, mix-typed args (especially when passed by ref), nullok, multi-typed return values (the classical 'string or false', for instance), and optional args (which can be passed by ref too)? And passing arrays in and out? I don't understand how such a model can fit our needs.
Regards
François
My idea is to cover most (but not all) extensions with a narrower,
simplified API. As you say, many interactions fall into the "marshal
this engine value to a C type as needed". This can cater to the
bread-and-butter of PHP extensions very well.
Having a rather large reliance on Firebird, the fact that it gets pushed
to the 'someone else will do that when they need it' pile is a little
irritating. All the extension does is interface the Firebird client to
PHP, and most of the times it has become unavailable were due to
framework changes. Other extensions are currently in the same state on
phpng? All database related.
HHVM has only recently started supporting the interbase extension, but I
think it does not have the fbird_ aliases. While there is still no
break between firebird and interbase, this is an area which will change
with the next generation of Firebird.
Alright, we do have a couple of people who understand the PHP side well
enough to address the very small number of bugs that have affected the
extension, but porting the still-missing extensions needs a programmer
who understands the fundamental changes being made :(
I know you are only talking about the mechanism Sara, but some of us are
still unable to even play with the current master on our framework
because only 'most' extensions are currently supported.
--
Lester Caine - G8HFL
Contact - http://lsces.co.uk/wiki/?page=contact
L.S.Caine Electronic Services - http://lsces.co.uk
EnquirySolve - http://enquirysolve.com/
Model Engineers Digital Workshop - http://medw.co.uk
Rainbow Digital Media - http://rainbowdigitalmedia.co.uk
My idea is to cover most (but not all) extensions with a narrower,
simplified API. As you say, many interactions fall into the "marshal
this engine value to a C type as needed". This can cater to the
bread-and-butter of PHP extensions very well.
Having a rather large reliance on Firebird, the fact that it gets pushed
to the 'someone else will do that when they need it' pile is a little
irritating. All the extension does is interface the Firebird client to
PHP, and most of the times it has become unavailable were due to
framework changes. Other extensions are currently in the same state on
phpng? All database related.
I think I may have been unclear.
I don't mean that "those exceptional extensions should have to extend
the API themselves", I mean that not everything needs to be
expressible in a common language. I picked out runkit as an example
to illustrate this because it does /weird/ things. Database
connectors like ibase/firebird/etc... don't generally do those things.
The goal of this project is, in fact, to stop breaking simple
extensions like firebird/ibase whose only purpose is to bridge PHP to
C/C++. The framework they depend on should be solid enough that an
extension written for PHP 5.2 will work without having to make changes
on PHP7, 8, 15, etc... Depending on how we do it, we might even be
able to make it ABI compatible between versions, meaning you don't
even need to recompile between 5.2 and 7.0 (yes, that's a stretch
goal).
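To sketch what "ABI compatible between versions" could mean in practice, here is a minimal C example of a versioned function table with an opaque value type, similar in spirit to how JNI hands native code a table of function pointers. All names (`ext_api`, `ext_value`, `ext_double_it`, the `toy_*` runtime) are invented for illustration and describe no existing PHP or HHVM interface. The assumption is that entries are only ever appended, so an extension compiled against an older table keeps working.

```c
#include <assert.h>
#include <stddef.h>

#define EXT_API_VERSION 1u

/* Opaque to extensions: each engine defines the layout privately. */
typedef struct ext_value ext_value;

/* Function table filled in by the runtime. */
typedef struct {
    unsigned version;   /* API revision the runtime implements */
    size_t   size;      /* size of the table as the runtime knows it */
    long (*get_long)(const ext_value *v);
    void (*set_long)(ext_value *v, long n);
} ext_api;

/* Extension code: talks to the engine only through the table. */
int ext_double_it(const ext_api *api, ext_value *v)
{
    assert(api != NULL && v != NULL);

    /* Refuse runtimes that are older than what we were built against. */
    if (api->version < EXT_API_VERSION || api->size < sizeof(ext_api))
        return -1;

    api->set_long(v, api->get_long(v) * 2);
    return 0;
}

/* Toy runtime, standing in for one engine's private implementation. */
struct ext_value { long n; };

static long toy_get_long(const ext_value *v) { return v->n; }
static void toy_set_long(ext_value *v, long n) { v->n = n; }

const ext_api toy_api = {
    EXT_API_VERSION, sizeof(ext_api), toy_get_long, toy_set_long
};
```

The extension function never sees `struct ext_value`'s layout, which is the property that would let the engine change it without a recompile.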
HHVM has only recently started supporting the interbase extension, but I
think it does not have the fbird_ aliases. While there is still no
break between firebird and interbase, this is an area which will change
with the next generation of Firebird.
As an aside, if you've got a list of what needs aliasing, I can add
that this weekend. :) In HHVM, aliasing a method literally looks
like:
function intlcal_get_repeated_wall_time_option(IntlCalendar $cal): int
{ return $cal->getRepeatedWallTimeOption(); }
Alright, we do have a couple of people who understand the PHP side well
enough to address the very small number of bugs that have affected the
extension, but porting the still-missing extensions needs a programmer
who understands the fundamental changes being made :(
Again, the goal here is to NOT need those experts involved. An
extension written for 5.2 shouldn't have to care about changes in the
engine, it should only care that "I expect these C types when my API
is called, and I'm returning this C type". I don't want you to have
to know that zvals no longer store reference semantics directly. I
don't want you to have to deal with refcounting. I don't want you to
have to know when to separate and when not to just so that you can
convert some mixed type.
I know you are only talking about the mechanism Sara, but some of us are
still unable to even play with the current master on our framework
because only 'most' extensions are currently supported.
I know. I'm one of 'em. This is partially coming out of my
frustrated attempts to #ifdef my way around making my extensions work
on PHP5 and PHP7 with the knowledge that the API is still shifting
under my feet. That's why I think it's time for a stable public API
which is independent from the engine.
-Sara
While I could work on this in the dark, manipulating HHVM's APIs with
one hand and adding proxy interfaces to PHP (as an extension) with the
other, I'd much rather have involvement from others.
Apropos of collaborating, who's going to be around FOSDEM/PHPBenelux
that would like to come up with a plan for this project? Reply where
you'll be and whenish you're free (offlist if you'd prefer).
Failing either of those venues, how about a Hangout or similar to hash
out goals and ideas at a bit better bandwidth? Reply with best
dates/times for this latter option.
-Sara
While I could work on this in the dark, manipulating HHVM's APIs with
one hand and adding proxy interfaces to PHP (as an extension) with the
other, I'd much rather have involvement from others.
Apropos of collaborating, who's going to be around FOSDEM/PHPBenelux
that would like to come up with a plan for this project? Reply where
you'll be and whenish you're free (offlist if you'd prefer).
Failing either of those venues, how about a Hangout or similar to hash
out goals and ideas at a bit better bandwidth? Reply with best
dates/times for this latter option.
I am all for a hangout in any case. Best way to reach and have everyone :)
Cheers,
Pierre
@pierrejoye | http://www.libgd.org
From: Pierre Joye [mailto:pierre.php@gmail.com]
Apropos of collaborating, who's going to be around FOSDEM/PHPBenelux
that would like to come up with a plan for this project?
I'd love to, but I can't afford it. I am not a PHP professional, so I would have to pay for it myself.
But I am OK with anything I can participate in from home. I am not sure my English is fluent enough, but I can try.
Failing either of those venues, how about a Hangout or similar to hash
out goals and ideas at a bit better bandwidth? Reply with best
dates/times for this latter option.
I am all for a hangout in any case. Best way to reach and have everyone :)
Sorry, I don't know what a 'hangout' is, as I am quite bad with communication tools in general. Google says it's a kind of chat on Google+. Is that what you mean?
Regards
François
On Mon, Jan 12, 2015 at 6:38 PM, François Laupretre
francois@tekwire.net wrote:
From: Pierre Joye [mailto:pierre.php@gmail.com]
Apropos of collaborating, who's going to be around FOSDEM/PHPBenelux
that would like to come up with a plan for this project?
I'd love to, but I can't afford it. I am not a PHP professional, so I would have to pay for it myself.
But I am OK with anything I can participate in from home. I am not sure my English is fluent enough, but I can try.
Failing either of those venues, how about a Hangout or similar to hash
out goals and ideas at a bit better bandwidth? Reply with best
dates/times for this latter option.
I am all for a hangout in any case. Best way to reach and have everyone :)
Sorry, I don't know what a 'hangout' is, as I am quite bad with communication tools in general. Google says it's a kind of chat on Google+. Is that what you mean?
known as googlechat in the past, google for it :)
--
Pierre
@pierrejoye | http://www.libgd.org
On Mon, Jan 12, 2015 at 6:38 PM, François Laupretre
francois@tekwire.net wrote:
Failing either of those venues, how about a Hangout or similar to hash
out goals and ideas at a bit better bandwidth? Reply with best
dates/times for this latter option.
Sorry, I don't know what a 'hangout' is, as I am quite bad with communication tools
in general. Google says it's a kind of chat on Google+. Is that what you mean?
Yeah, it's an audio/video group conferencing type thing. Thinking to
use that for the initial kickoff so that we can pick a general
direction (or a few options) to build from.
-Sara
From: php@golemon.com [mailto:php@golemon.com] On behalf of Sara Golemon
Yeah, it's an audio/video group conferencing type thing. Thinking to
use that for the initial kickoff so that we can pick a general
direction (or a few options) to build from.
OK. I am available every day, preferably on weekdays, any time from noon to midnight (French time).
But I am afraid I don't have enough material to form an opinion on what you're planning. Could you write something slightly more elaborate? A scenario, even a partial one, because if I don't understand where you are going, I can't help much.
Cheers
François
The fact that hhvm implements a significant part of the extensions (or
other areas) using PHP+additional syntax as well as adding cleaner
APIs or mechanisms for the C parts only confirms me one thing: the
very 1st problem we have to solve is to ease the extension creation,
by drastically changing the internals APIs & tools. Bundling script
does not help here, we are using a scotch tape to repair something
that should have been replaced or redesigned since long already. I am
not blaming anyone, the engine design, historically, does not make
such changes easy.
Funny to see you mention this. I literally just pulled together a
meeting today to discuss HHVM's admittedly unstable extension API.
One idea to emerge from this was to design a new extension API
agnostic of underlying implementation. An API which, if done right,
could be adapted equally to both PHP7 and HHVM and provide the
opportunity to introduce PHP5 shims so that a single extension could
be written to function identically under any PHP runtime, and any
version. If done right, could make extensions not just source
compatible, but binary compatible as well. The engine details can
change, but the public-facing extension API could offer a consistent
way of doing the one and only thing that extensions should have to do:
Glue PHP into external libraries.
That's a bit pie in the sky, I'll admit, but wouldn't that be cool?
Fact is, JNI does this for Java already, so there's precedence to
learn from. Heck, we're actually halfway there with HHVM's
"ext_zend_compat" layer, which makes PHP extensions (mostly) source
compatible with HHVM.
I was a very heavy user of JNI. Sucked big time. You pay a high price for trying to keep a consistent API and marshaling. While this is slightly different I don't see how you avoid some of the additional overhead plus it will be very challenging to really cover everything that's needed.
Just my 2 cents. Had to respond because of how crappy JNI was/is :)
Andi
While I could work on this in the dark, manipulating HHVM's APIs with
one hand and adding proxy interfaces to PHP (as an extension) with the
other, I'd much rather have involvement from others.
What do you think?
-Sara
I was a very heavy user of JNI. Sucked big time. You pay a high price for trying to keep a consistent API and marshaling. While this is slightly different I don't see how you avoid some of the additional overhead plus it will be very challenging to really cover everything that's needed.
Just my 2 cents. Had to respond because of how crappy JNI was/is :)
I fully agree. JNI is crap.
However, let's face it, our internals API is really not good. Yes,
there were/are ongoing efforts for small parts, but overall it is a
pain to achieve simple things.
--
Pierre
@pierrejoye | http://www.libgd.org
Hi,
Here's a function definition:
#PHP proto: int newt_centered_window ( int $width , int $height [, string $title=null ] )
return_type: int
arguments:
  width:
    type: int
  height:
    type: int
  title:
    type: string
    nullok: true
{% block body %}
if (title_is_null) title=NULL;
retval=newtCenteredWindow((long)width, (long)height, title);
{% endblock %}
More at: https://github.com/flaupretre/php-ext-gen
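For readers who want to picture the result, here is a rough C sketch of the kind of function a generator might wrap around that body block. It is a guess for illustration only, not the actual output of php-ext-gen: the wrapper name, its signature, and the `title_is_null` flag being passed as a parameter are assumptions, and `fake_centered_window` is a stand-in for `newtCenteredWindow` so that the sketch compiles without libnewt.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the real library call so the sketch builds without
 * libnewt. It records the title it received for inspection. */
static const char *last_title = "unset";

static int fake_centered_window(long width, long height, const char *title)
{
    last_title = title;
    return (width > 0 && height > 0) ? 0 : -1;
}

/* Hypothetical generated wrapper: by the time it runs, the arguments
 * are already plain C values, plus a flag for the 'nullok' argument. */
long wrapper_newt_centered_window(long width, long height,
                                  const char *title, int title_is_null)
{
    long retval;

    /* Assumption: a non-null argument always comes with a valid pointer. */
    assert(title_is_null || title != NULL);

    /* --- begin user body, copied from the definition --- */
    if (title_is_null) title = NULL;
    retval = fake_centered_window(width, height, title);
    /* --- end user body --- */

    return retval;   /* the glue would convert this into a PHP int */
}
```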
Any thoughts?
François