[RFC] Extension Prepend Files

10 years ago by reeze

I like the idea. I have implemented a utility [1] to help write extensions
in PHP. It follows the approach HHVM adopted, embedding PHP scripts into the
extension's binary file, so it may be what you want. I would like to see this
supported in core, and I would be happy to implement the RFC if others like it.

[1] https://github.com/reeze/php-ext-embed

Hey everyone,

I want to open discussion on my RFC to strengthen the ability of extensions
to provide functionality to developers in both C and PHP code.

For this, extensions can add PHP files to a list of "prepend files" that are
part of every request execution, exactly the same way the INI
auto_prepend_file functionality works:

https://wiki.php.net/rfc/extension_prepend_files

I propose implementation details in the RFC, but they are completely up to
discussion. I am even sure there is probably a better way than what I
proposed, because I am not familiar with the code.

greetings
Benjamin

10 years ago by Benjamin Eberlei

Hey reeze,

This looks like a fantastic approach. Can you explain how you compile the
PHP code into the shared object? The README doesn't explain much.

greetings
Benjamin

10 years ago by reeze

Hi Benjamin,
I don't cache the opcodes myself. I tried that before, but I think caching
should be left to opcode caches (that avoids duplication, and opcache will
optimize the compiled opcodes), so I use the method @François mentioned: a
stream wrapper that takes advantage of opcode caches. One problem is that
opcache doesn't allow streams other than file:// and phar://. @François
created an RFC to support that [1]; if it is accepted, this should work at
full speed :)

--
[1] https://github.com/php/php-src/pull/976/files

10 years ago by francois@tekwire.net

From: Benjamin Eberlei [mailto:kontakt@beberlei.de]
I want to open discussion on my RFC to strengthen the ability of extensions
to provide functionality to developers in both C and PHP code.

For this extensions can add PHP files to a list of "prepend files" that are
part of every request execution exactly the same way the INI
auto_prepend_file functionality works:

https://wiki.php.net/rfc/extension_prepend_files

I propose implementation details in the RFC, but they are completely up to
discussion. I am even sure there is probably a better way than what I
proposed, because I am not familiar with the code.

Can you please elaborate on the API changes in your RFC? You're talking about changes in zend_execute_scripts(), a new zend_execute_script() function (but the name in the prototype is wrong), then php_execute_scripts()...

Where are the file handles coming from? Does it mean that the files to prepend will be kept open?

Do you register paths to prepend, or will the scripts be loaded in memory during MINIT?

How will these files be cached by opcode caches (mandatory for such a feature)?

And, finally, how can an extension determine where a given PHP script it requires has been installed? It makes sense in php.ini because it is under the control of the end user. But an extension is just a C library, installed anywhere or bundled in the PHP executable. Would it compute the script paths from PHP installation paths?

Actually, I don't see how it would work with PHP scripts which would remain separate from the extension code. What I would imagine for such a feature, would be PHP code embedded in the extension as a memory buffer, registered during MINIT, and then executed at each RINIT. This would probably require a stream wrapper because the opcode cache would require paths for this code. We can also concatenate the scripts but we still need a virtual path. The stream wrapper would just be simpler to implement.
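
To make the embedded-buffer idea a bit more concrete, here is a minimal sketch in plain C. It deliberately does not use the real Zend or streams API; embedded_script, embedded_script_register(), embedded_script_find() and the "ext-embed://" scheme are all hypothetical names made up for illustration. It only shows the bookkeeping: buffers are registered once under a virtual path (what MINIT would do), and a lookup by virtual path (roughly what a stream wrapper's open handler would need) returns the buffer.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical record for one PHP script compiled into the extension. */
typedef struct {
    const char *virtual_path; /* e.g. "ext-embed://myext/bootstrap.php" (made-up scheme) */
    const char *source;       /* PHP source stored as a NUL-terminated C string */
    size_t      length;       /* length of source in bytes */
} embedded_script;

#define MAX_EMBEDDED_SCRIPTS 16

static embedded_script registry[MAX_EMBEDDED_SCRIPTS];
static size_t registry_count = 0;

/* Would be called once per script from the extension's MINIT handler.
 * Returns 0 on success, -1 if the input is invalid or the registry is full. */
int embedded_script_register(const char *virtual_path, const char *source)
{
    if (virtual_path == NULL || source == NULL ||
        registry_count >= MAX_EMBEDDED_SCRIPTS) {
        return -1;
    }
    registry[registry_count].virtual_path = virtual_path;
    registry[registry_count].source = source;
    registry[registry_count].length = strlen(source);
    registry_count++;
    return 0;
}

/* Maps a virtual path back to its buffer, as a stream wrapper would have to.
 * Returns NULL when nothing is registered under that path. */
const embedded_script *embedded_script_find(const char *virtual_path)
{
    for (size_t i = 0; i < registry_count; i++) {
        if (strcmp(registry[i].virtual_path, virtual_path) == 0) {
            return &registry[i];
        }
    }
    return NULL;
}
```

The virtual path is the piece an opcode cache could use as a cache key. Whether a given cache accepts a non-file scheme is a separate question that this sketch does not address.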

Regards

François

10 years ago by Benjamin Eberlei

On Sun, Jan 4, 2015 at 1:36 PM, François Laupretre francois@tekwire.net
wrote:

Can you please develop the API changes in your RFC ? You're talking about
changes in zend_execute_scripts(), a new zend_execute_script() function
(but the name in the prototype is wrong), then php_execute_scripts()...

Fixed the typo in zend_execute_script. The API of php_execute_scripts
doesn't change, only the implementation.

Where are the file handles coming from ? Does it mean that the files to
prepend will be kept open ?

File handles have existed before in php_execute_scripts, they were passed
via the va_list into zend_execute_scripts.

Do you register paths to prepend, or will the scripts be loaded in memory
during MINIT ?

You register file paths to prepend.

How will these files be cached by opcode caches (mandatory for such a
feature) ?

Files will be cached, because opcache hooks into zend_compile_file, which
is used downstream of zend_execute_scripts. That means the feature would
automatically use and work with opcache.

And, finally, how can an extension determine where a given PHP script it
requires has been installed ? It makes sense in php.ini because it is under
control of the final user. But an extension is just a C library, installed
anywhere or bundled in the PHP executable. Would it compute the script
paths from PHP installation paths ?

Files must either be in the PHP include path or be referenced by an absolute
path, for example by putting them right next to the shared object (.so) files
and using the extension_dir path inside the code. See the following for an
example (a hacked approach):
https://github.com/QafooLabs/php-profiler-extension/blob/master/qafooprofiler.c#L744

Actually, I don't see how it would work with PHP scripts which would
remain separate from the extension code. What I would imagine for such a
feature, would be PHP code embedded in the extension as a memory buffer,
registered during MINIT, and then executed at each RINIT. This would
probably require a stream wrapper because the opcode cache would require
paths for this code. We can also concatenate the scripts but we still need
a virtual path. The stream wrapper would just be simpler to implement.

This uses the usual "require" functionality that you can call from PHP as
well. If the files are not found, then you get an error.

10 years ago by Rowan Collins

Benjamin Eberlei wrote on 04/01/2015 13:00:

Files must be either in the PHP include path, or via absolute path, for
example by putting them right next to the shared object (.so) files and
using the extension_dir path inside the code, see the following for an
example (a hacked approach):
https://github.com/QafooLabs/php-profiler-extension/blob/master/qafooprofiler.c#L744

Beware that anything that looks in a configurable or
environment-dependent list of include directories has the potential to
open up security holes - anyone who can change that setting, or who has
write access to a directory with higher priority, can inject code in a
very broad context. It can also lead to surprising side effects if a
file with the same name is coincidentally placed into more than one
directory in the search path.

I think if the intention is for extensions to ship these files directly,
the registration mechanism should create an absolute path based either
on a single configured directory, which distributors will set based on
their installation paths, or on the directory containing the .so file
being loaded.
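
As a rough illustration of what I mean, here is a sketch in plain C. The function name prepend_build_path() is hypothetical, it is not part of any PHP API, and it assumes POSIX-style paths. The point is only that the registration step combines one fixed, absolute base directory with a bare file name, so the include path is never consulted and a name like "../evil.php" is rejected.

```c
#include <stdio.h>
#include <string.h>

/* Build "<base_dir>/<file_name>" into out. base_dir stands in for the single
 * configured directory (or the directory containing the .so being loaded).
 * Returns 0 on success, -1 if the input is rejected or out is too small. */
int prepend_build_path(const char *base_dir, const char *file_name,
                       char *out, size_t out_size)
{
    /* The base must be absolute so that no search path is involved. */
    if (base_dir == NULL || base_dir[0] != '/') {
        return -1;
    }
    /* Accept a bare file name only: no separators, so no "../" escapes. */
    if (file_name == NULL || file_name[0] == '\0' ||
        strchr(file_name, '/') != NULL || strchr(file_name, '\\') != NULL ||
        strcmp(file_name, ".") == 0 || strcmp(file_name, "..") == 0) {
        return -1;
    }
    int written = snprintf(out, out_size, "%s/%s", base_dir, file_name);
    if (written < 0 || (size_t)written >= out_size) {
        return -1; /* output would have been truncated */
    }
    return 0;
}
```

An extension would then register only the file name, and the engine (or the distributor's configuration) would decide the directory.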

Regards,

Rowan Collins
[IMSoP]

10 years ago by Pierre Joye

Hi,

Hey everyone,

I want to open discussion on my RFC to strengthen the ability of
extensions
to provide functionality to developers in both C and PHP code.

For this extensions can add PHP files to a list of "prepend files" that
are
part of every request execution exactly the same way the INI
auto_prepend_file functionality works:

https://wiki.php.net/rfc/extension_prepend_files

I propose implementation details in the RFC, but they are completely up to
discussion. I am even sure there is probably a better way than what I
proposed, because I am not familiar with the code.

I understand the idea; however, I wonder what the gains are for the user.
In any case, the file has to be in the right place, etc.

Also, as it is internals only, it could be nicer to expose the prepend
config using the existing directive, with options like first, before, last
for the insert position.

While we are at it, a similar feature could be added to auto prepend/append,
as it can be very useful in userland as well.

Cheers,
Pierre

10 years ago by Benjamin Eberlei

I understand the idea however I wonder what are the gains for the user? On
any case the file has to be at the right place, etc.

The shared object file has to be at the right place as well.

Also as it is internals only, it could be nicer to expose the prepend
config using the existing directive with options like first, before, last
for the insert position.

This assumes that we have dependencies between extensions, which we don't
have anyways. Now if extensions depend on each other the order is
important. I assume that the loading order is also the order that the MINIT
is called, which then would also put dependent php files in order.
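
A small sketch of that ordering assumption, in plain C rather than against the Zend API. The names prepend_file_register(), prepend_file_count() and prepend_file_at() are hypothetical. Each extension appends its path during MINIT, and the per-request code walks the list from index 0, so files run in the same order the extensions were initialised.

```c
#include <stddef.h>

#define MAX_PREPEND_FILES 32

static const char *prepend_files[MAX_PREPEND_FILES];
static size_t prepend_count = 0;

/* Hypothetical registration call, made from each extension's MINIT.
 * Returns 0 on success, -1 if path is NULL or the list is full. */
int prepend_file_register(const char *path)
{
    if (path == NULL || prepend_count >= MAX_PREPEND_FILES) {
        return -1;
    }
    prepend_files[prepend_count++] = path; /* append keeps MINIT order */
    return 0;
}

/* Number of registered prepend files. */
size_t prepend_file_count(void)
{
    return prepend_count;
}

/* Path at position index in registration order, or NULL if out of range.
 * Per-request code would iterate from 0 to prepend_file_count() - 1. */
const char *prepend_file_at(size_t index)
{
    return index < prepend_count ? prepend_files[index] : NULL;
}
```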

While being at it, a similar feature can be added to auto prepend/append
as it can be very useful in user land as well.

Yes maybe, although I don't see the benefit for userland TBH, compared to
extensions. Extensions are about declaring functions/classes, prepend is
normally about actually "executing"/running code.

10 years ago by Pierre Joye

I understand the idea however I wonder what are the gains for the user?
On any case the file has to be at the right place, etc.

The shared object file has to be at the right place as well.

Make install is build dependent; include paths are runtime dependent. That
makes a slightly bigger difference.

Also as it is internals only, it could be nicer to expose the prepend
config using the existing directive with options like first, before, last
for the insert position.

This assumes that we have dependencies between extensions, which we don't
have anyways. Now if extensions depend on each other the order is
important. I assume that the loading order is also the order that the MINIT
is called, which then would also put dependent php files in order.

While being at it, a similar feature can be added to auto prepend/append
as it can be very useful in user land as well.

Yes maybe, although I don't see the benefit for userland TBH, compared to
extensions. Extensions are about declaring functions/classes, prepend is
normally about actually "executing"/running code.

Not always, really. But the two are so similar that I have a hard time
seeing why they should be different. Feel free to enlighten me :)

Cheers,
Pierre

10 years ago by Julien Pauli

On Sun, Jan 4, 2015 at 12:52 PM, Benjamin Eberlei kontakt@beberlei.de
wrote:

Hey everyone,

I want to open discussion on my RFC to strengthen the ability of extensions
to provide functionality to developers in both C and PHP code.

For this extensions can add PHP files to a list of "prepend files" that are
part of every request execution exactly the same way the INI
auto_prepend_file functionality works:

https://wiki.php.net/rfc/extension_prepend_files

I propose implementation details in the RFC, but they are completely up to
discussion. I am even sure there is probably a better way than what I
proposed, because I am not familiar with the code.

Hello.

Can't this already be done somehow? In the RINIT stage, obviously.
I don't understand why we should change our API to add a feature we can
already use.

Julien.P

10 years ago by Rowan Collins

Julien Pauli wrote on 05/01/2015 16:19:

Hello.

Can't this be already done somehow ? In RINIT stage obviously.
I don't understand why to change our API to add a feature we already can
use ?

Julien.P

The RFC explains the motivation reasonably clearly. In particular:

Using RINIT is error prone and a little bit dangerous, because is a
bit too early for some data to be cleanly created (globals).

Basically, it's a standardised implementation, and allows the code to
run at a more appropriate phase of execution.

--
Rowan Collins
[IMSoP]

10 years ago by Julien Pauli

On Mon, Jan 5, 2015 at 5:26 PM, Rowan Collins rowan.collins@gmail.com
wrote:

Julien Pauli wrote on 05/01/2015 16:19:

Hello.

Can't this be already done somehow ? In RINIT stage obviously.
I don't understand why to change our API to add a feature we already can
use ?

Julien.P

The RFC explains the motivation reasonably clearly. In particular:

Using RINIT is error prone and a little bit dangerous, because is a bit
too early for some data to be cleanly created (globals).

This is a problem that can already be dealt with.
Globals are created before RINIT.

Basically, it's a standardised implementation, and allows the code to run
at a more appropriate phase of execution.

Yes, I understand the RFC as a standardisation of a concept that is very
rarely used, and that could lead to problems with opcode caches.
Having extensions declare PHP files to load, knowing that we can dl() them
at runtime and that we have static vs dynamic extensions... That smells like
problems to me (at least it's not as easy as the RFC states). However,
nothing is impossible.

Julien.P

10 years ago by Sara Golemon

I want to open discussion on my RFC to strengthen the ability of extensions
to provide functionality to developers in both C and PHP code.

For this extensions can add PHP files to a list of "prepend files" that are
part of every request execution exactly the same way the INI
auto_prepend_file functionality works:

https://wiki.php.net/rfc/extension_prepend_files

I propose implementation details in the RFC, but they are completely up to
discussion. I am even sure there is probably a better way than what I
proposed, because I am not familiar with the code.

So, I've been meaning to propose something like this, but with a few
key implementation detail differences:

1. Create the notion of "Persistent User
   Functions/Classes/Constants/etc...". This is an important perf item
   as reloading a prepend file on EVERY request is costly. Less costly
   with an opcache, sure, but still costly. Making the entries
   persistent lets us deal with this once in the process lifetime and
   keep the data around.
2. Embedded text sections. It's possible to place the raw PHP code
   into the compiled .so/.dylib/.dll file and fetch it out for
   compilation at runtime. This enables easy bundling of the loaded
   scripts, obviates the need to track what directory the files are in,
   and generally makes it cleaner. ((Pre-compiling to bytecode is an
   option, but it complicates the reloading and doesn't really buy us
   much.))

-Sara

10 years ago by Stanislav Malyshev

Hi!

Create the notion of "Persistent User
Functions/Classes/Constants/etc...". This is an important perf item
as reloading a prepend file on EVERY request is costly. Less costly
with an opcache, sure, but still costly. Making the entries
persistent lets us deal with this once in the process lifetime and
keep the data around.

That looks like a much bigger can of worms. I.e. would these classes be
immutable? Would we ban functions/methods from having static vars? How
would we achieve that? etc. etc.

Embedded text sections. It's possible to place the raw PHP code
into the compiled .so/.dylib/.dll file and fetch it out for
compilation at runtime. This enables easy bundling of the loaded

I guess it is possible, but why - what's wrong with plain old files and
phars?

scripts, obviates the need to track what directory the files are in,

You still need to address the files somehow if you plan to have more
than one, which would eventually lead you to the need to namespace it
since people tend to name their files "utils.php". Which would be
essentially the same as directories. So I'm not sure I understand the
win here.

Stas Malyshev
smalyshev@gmail.com

10 years ago by Derick Rethans

Embedded text sections. It's possible to place the raw PHP code
into the compiled .so/.dylib/.dll file and fetch it out for
compilation at runtime. This enables easy bundling of the loaded

I guess it is possible, but why - what's wrong with plain old files and
phars?

Deployment and installation. Right now, "pecl install extension" doesn't
really allow you to also install and load a PHP script on every
request. Such a PHP script could define extra classes that are written
in PHP, because maintaining them as a C implementation would be way
more work. Having users install another set of PHP files for every
project is error prone, and frankly, bad for user experience. I can
definitely see the use case here, and it's probably something we'd want
to use for the new MongoDB driver.

cheers,
Derick

10 years ago by Benjamin Eberlei

Embedded text sections. It's possible to place the raw PHP code
into the compiled .so/.dylib/.dll file and fetch it out for
compilation at runtime. This enables easy bundling of the loaded

I guess it is possible, but why - what's wrong with plain old files and
phars?

Deployment and installation. Right now, "pecl install extension" doesn't
really allow you to also install and load a PHP script on every
request. Such a PHP script could define extra classes, that are written
in PHP - because maintainting them as a C implementation would be way
more work. Having users install another set of PHP files for every
project is error prone, and frankly, bad for user experience. I can
definitely see the use case here, and it's probably something we'd want
to use for the new MongoDB driver.

Yes the deployment aspect isn't discussed in my RFC and that is an
important part.

Frankly most of the ideas presented here are way better than the hack I
have come up with, the question is how to proceed to refine this idea?

The general plan seems to be to have a way to embed PHP code into the
shared object, for example using Autoconf Macros (like the embed extension
mentioned above does)

10 years ago by Pierre Joye

The general plan seems to be to have a way to embed PHP code into the
shared object, for example using Autoconf Macros (like the embed extension
mentioned above does)

Just for the record, stated it before but... :)

I do think it is a very bad idea to do it. See my other replies for the reasons.

--
Pierre

@pierrejoye | http://www.libgd.org

10 years ago by Pierre Joye

Embedded text sections. It's possible to place the raw PHP code
into the compiled .so/.dylib/.dll file and fetch it out for
compilation at runtime. This enables easy bundling of the loaded

I guess it is possible, but why - what's wrong with plain old files and
phars?

Deployment and installation. Right now, "pecl install extension" doesn't
really allow you to also install and load a PHP script on every
request. Such a PHP script could define extra classes, that are written
in PHP - because maintainting them as a C implementation would be way
more work. Having users install another set of PHP files for every
project is error prone, and frankly, bad for user experience. I can
definitely see the use case here, and it's probably something we'd want
to use for the new MongoDB driver.

The more I read the replies here, the more I think we have two issues
to solve. That does not mean this proposal (how it will be done
seems to be very open at this point) is a bad idea.

1. Current internals APIs are not good enough to easily implement
   and maintain exposed classes (declaration, implementation, etc.).
2. We have no easy way to actually release and deploy ad hoc scripts
   used by a given extension.

For 1., as stated earlier, we really need to work on that. The sooner
the better.

For 2., it is one of the things I can imagine implementing in pickle.
Or, even better, add it as part of the build scripts and macros. Either
will work, even for binary installs on Windows, for example. I would really
like to define a clear location relative to the extension directory.
This location will be used by this new feature, if implemented. I also
think that it should be either part of the current prepend setting, by
default and system only, or another setting. What I really consider
bad practice is any kind of bundling of scripts in the extensions; that
will be a real pain to maintain and it opens some unknown can of
worms. Distros may also prefer to have that script outside the
extension, which makes cherry-picked updates easier when necessary (for
example, security fixes).

Please take it as a brainstorming, thoughts?

10 years ago by francois@tekwire.net

From: Pierre Joye [mailto:pierre.php@gmail.com]

We have no easy way to actually release and deploy adhoc scripts,
used by a given extension

For 2., it is one of the things I can imagine implementing in pickle.
Or, even better, add it as part of the build scripts and macros. Either
will work, even for binary installs on Windows, for example. I would really
like to define a clear location relative to the extension directory.
This location will be used by this new feature, if implemented. I also
think that it should be either part of the current prepend setting, by
default and system only, or another setting. What I really consider as
a bad practice is any kind of bundling scripts in the extensions, that
will be a real pain to maintain and it opens some unknown can of
worms. Distros may also prefer to have that script outside the
extension, easier for cherry picks update when necessary (f.e.
security fixes).

I am sorry to say that but, with all due respect, I have a totally opposite opinion.

IMHO, this PHP code (and any other data we want to embed) is an integral part of the extension. A compiled extension is a consistent piece of code, it is contained in a single file and should not be split to separate files.

The first reason is that, if we consider a released extension as a conceptual unit, the best way to protect its integrity is to store it as a single file. Storing it as separate files brings a lot of potential issues: files can be renamed, deleted, etc. Offline tools like composer can take care of integrity but, from an end user's point of view, it will never be as clear as a single file (.so for extensions, or package for PHP software).

The second reason, more important, is about inter-communication between C and PHP. I understand the point of view: it is easier to allow distros to upgrade an extension's PHP code without upgrading the C code. But, if we consider the PHP code as an integral part of the extension, this should be avoided, as C and PHP code need to be kept in sync.

If you allow upgrading the PHP code only, you must consider them as two separate pieces of code and, so, consider BC in the way they communicate with each other. This is a perfectly valid scheme but, in this case, this PHP code is not part of your extension anymore; it is separate software with its own version/release numbering, which communicates with your extension. If distros could cherry-pick the C files they want to upgrade, depending on the features they want to provide, would you be ready to keep BC between all your C files? The same would apply if libc were distributed as a bunch of individually upgradable files, each defining a function, with consistency ensured by metadata and an offline tool.

So, to summarize my opinion: if an extension writer wants to distribute part of their code in PHP and wants this code to be independently upgradable, they must distribute it as an independent software package, not as part of the extension. All the code considered part of an extension, whatever the language, must be kept in sync. And the best way to achieve this is to store it in a single file.

Regards,

François

10 years ago by Pierre Joye

The whole code to consider as part of an extension, whatever the language,
must be kept in sync. And the best way to achieve this is to store it in a
single file.

I would rather say in a single release and package. Building scripts into an
extension is an extremely bad step, given the maintenance, issue management
and all the possible complications that allowing it may bring. But we
understand each other's views. I only hope we will not do that.

It is also interesting that the other questions were ignored, which are
IMHO very critical. :)

10 years ago by Benjamin Eberlei

The whole code to consider as part of an extension, whatever the language,
must be kept in sync. And the best way to achieve this is to store it in a
single file.

I would rather say in a single release and package. Built in scripts in an
extension is an extremely bad step, maintenance, issues management and all
possible complications to allow that may bring. But we get each other view.
I only hope we will not do that.

Also interesting to ignore the other questions, which are imho very
critical. :)

How is shipping PHP code inside the .so/.dll worse than shipping just the C
code this way? I agree with you that it makes changing the PHP code more
difficult (it requires a recompile), but it makes deployment and build much
simpler. Including the security concerns.

10 years ago by Pierre Joye

On Jan 6, 2015 9:20 PM, "François Laupretre" francois@tekwire.net
wrote:
The whole code to consider as part of an extension, whatever the
language, must be kept in sync. And the best way to achieve this is to
store it in a single file.

I would rather say in a single release and package. Built in scripts in
an extension is an extremely bad step, maintenance, issues management and
all possible complications to allow that may bring. But we get each other
view. I only hope we will not do that.

Also interesting to ignore the other questions, which are imho very
critical. :)

How is shipping php code inside so/ddl worse than shipping just the C
code this way? I agree with you it makes changing the PHP code more
difficult (requires recompile), but it makes the deployment and build much
simpler. Including the security concerns.

It does not make the builds simpler, on the contrary. From an initial release
point of view, nothing changes.

However, for any other release, it changes a lot. The glue code may change
more often (it does for a few of my extensions, for example), and the update
process is then totally different from a packaging and deployment point of
view.

From a core PHP point of view, it needs this new feature. I never had the
need for it while shipping glue code as part of releases for quite some
time. It introduces something radically different and not as simple as it
may sound. I have doubts about the gains given the costs.

Let alone the total lack of flexibility, be it for support ("please try that
patch") or for customizing the glue code. Customizations can still be done,
but the upstream versions will always be loaded, whatever they do (and
could prevent customization in some cases).

I also wonder why PHP needs that while other similar languages have
chosen to improve packaging and default installation to drastically
ease packages made of native and script code (see Python or Perl for
example).

And a detail: why does it need to be "bundled to be faster" when it is used
for non-performance-critical parts? Though this argument may switch to
"easier to have it all synced" :)

That leaves the other points I mentioned: what about making user-exposed
classes and the like from internals more friendly?

What about the alternative solution? Default location, etc.? Or are we going
to see, again, an RFC that does not take the comments into account and goes
with a single-option vote? (Just to prevent that from happening again.)

10 years ago by Stanislav Malyshev

Hi!

The first reason is that, if we consider a released extension as a
conceptual unit, the best way to protect its integrity is to store it
as a single file. Storing it as separate files brings a lot of
potential issues : files can be renamed, deleted, etc. Offline tools

Yet people have been releasing multi-file software packages for decades,
and it seems to work fine and the software world has yet to collapse
under these problems. Which suggests they may not be as severe as they
seem to be.

code. But, if we consider the PHP code as an integral part of the
extension, this should be avoided, as C and PHP code need to be kept
in sync.

Again, there is a multitude of solutions for this, all in the realm of
packaging. It's not like we've just encountered the idea of software
package having more than one file.

Stas Malyshev
smalyshev@gmail.com

10 years ago by Derick Rethans

But, if we consider the PHP code as an integral part of the
extension, this should be avoided, as C and PHP code need to be kept
in sync.

Again, there is a multitude of solutions for this, all in the realm of
packaging. It's not like we've just encountered the idea of software
package having more than one file.

There is currently no way to install an extension and a PHP library
package at the same time. "pecl" can't install PHP libraries, and
"composer" can't install extensions. And even if it did, keeping the
versions in sync is not easy at all. Only way to solve this properly is
to allow PHP code embedded in the extension. Just like the couchdb ext
does, or Benjamin wants to do with the qafooprofiler, or HHVM does it
through HNI.

cheers,
Derick

10 years ago by Pierre Joye

But, if we consider the PHP code as an integral part of the
extension, this should be avoided, as C and PHP code need to be kept
in sync.

Again, there is a multitude of solutions for this, all in the realm of
packaging. It's not like we've just encountered the idea of software
package having more than one file.

There is currently no way to install an extension and a PHP library
package at the same time. "pecl" can't install PHP libraries, and
"composer" can't install extensions. And even if it did, keeping the
versions in sync is not easy at all. Only way to solve this properly is
to allow PHP code embedded in the extension. Just like the couchdb ext
does, or Benjamin wants to do with the qafooprofiler, or HHVM does it
through HNI.

Pickle integration is almost finished, that solves your worries about
composer and pecl extensions.

And if you read the other replies carefully, I have proposed a cleaner solution
(than bundling scripts).

And no, bundling code is not the easy way; it is a can of worms to
implement and maintain, and even worse for distros.

10 years ago by Stanislav Malyshev

Hi!

There is currently no way to install an extension and a PHP library
package at the same time. "pecl" can't install PHP libraries, and

Why does it need to be "at the same time"? I don't see any use case where it
would matter if you run one command or two commands to install it. In
fact, if it's such a problem there are many tools that allow you to
perform multiple actions by running one command. Various package
managers all do that, for example.

"composer" can't install extensions. And even if it did, keeping the
versions in sync is not easy at all. Only way to solve this properly is

It seems to me you're reinventing packaging systems. I don't see why we
should invent our own and why our own should take form of putting PHP
code into compiled binaries (yet less why suddenly it is the "only
way"). Many languages have extension systems and packages that involve
binaries - Perl, Python, Ruby, etc. AFAIK none of them puts source code
into binaries.

--
Stas Malyshev
smalyshev@gmail.com

10 years ago by Pierre Joye

"composer" can't install extensions. And even if it did, keeping the
versions in sync is not easy at all. Only way to solve this properly is

It seems to me you're reinventing packaging systems. I don't see why we
should invent our own and why our own should take form of putting PHP
code into compiled binaries (yet less why suddenly it is the "only
way"). Many languages have extension systems and packages that involve
binaries - Perl, Python, Ruby, etc. AFAIK none of them puts source code
into binaries.

Composer's ability to install extensions is a long-awaited feature.
Pickle (using the composer.json spec format and the metadata available in the
src tree) fills this gap. On this specific point, it is not about
reinventing the wheel but about finishing it, or making it more round :)

I however fully agree with you about putting src code into binaries.

Cheers,

Pierre

@pierrejoye | http://www.libgd.org

10 years ago by Derick Rethans

There is currently no way to install an extension and a PHP library
package at the same time. "pecl" can't install PHP libraries, and

Why it needs to be "at the same time"? I don't see any use case where
it would matter if you run one command or two commands to install it.

Well, I do. And it's not just installing two packages. It is also
adding extra lines to every script to load code that your extension
depends on. I would pick a "pecl install" over a "pecl install this",
"composer install that", "add a few lines to a script" as an
installation method anytime.

"composer" can't install extensions. And even if it did, keeping the
versions in sync is not easy at all. Only way to solve this properly
is

It seems to me you're reinventing packaging systems.

I want to solve the issue where I have a PHP library that is tied to C
code, without having to deal with random tools and dependencies for no
reason. PHP packaging systems don't do that.

I don't see why we should invent our own and why our own should take
form of putting PHP code into compiled binaries (yet less why suddenly
it is the "only way"). Many languages have extension systems and
packages that involve binaries - Perl, Python, Ruby, etc. AFAIK none
of them puts source code into binaries.

Most of those languages don't depend as heavily on parts written in C
though. They will only break out to C for specific reasons. PHP
extensions are the other way around. It's almost always C, but some opt
to also use some PHP to make development faster.

cheers,
Derick

10 years ago by Pierre Joye

There is currently no way to install an extension and a PHP library
package at the same time. "pecl" can't install PHP libraries, and

Why it needs to be "at the same time"? I don't see any use case where
it would matter if you run one command or two commands to install it.

Well, I do. And it's not just installing two packages. It is also
adding extra lines to every script to load code that your extension
depends on. I would pick a "pecl install" over a "pecl install this",
"composer install that", "add a few lines to a script" as an
installation method anytime.

It is not. pecl install can already install anything you want, including
PHP scripts, when it installs an extension. The only question
is where to put them. But I have said that many times in this thread
already.

"composer" can't install extensions. And even if it did, keeping the
versions in sync is not easy at all. Only way to solve this properly
is

It seems to me you're reinventing packaging systems.

I want to solve the issue where I have a PHP library that is tied to C
code, without having to deal with random tools and dependencies for no
reason. PHP packaging systems don't do that.

It does.

I don't see why we should invent our own and why our own should take
form of putting PHP code into compiled binaries (yet less why suddenly
it is the "only way"). Many languages have extension systems and
packages that involve binaries - Perl, Python, Ruby, etc. AFAIK none
of them puts source code into binaries.

Most of those languages don't depend as heavily on parts written in C
though. They will only break out to C for specific reasons. PHP
extensions are the other way around. It's almost always C, but some opt
to also use some PHP to make development faster.

And here we go again, showing that we actually try to tackle the wrong problem.

I have discussed with Laruence earlier and will make a patch + demo
extension to show my thoughts. It may end this circular discussion and
start one about concrete needs and implementation issues instead.

--
Pierre

@pierrejoye | http://www.libgd.org

10 years ago by Derick Rethans

From: Pierre Joye [mailto:pierre.php@gmail.com]

We have no easy way to actually release and deploy adhoc scripts,
used by a given extension

For 2., it is one of the things I can imagine implementing in pickle.
Or even better, add it as part of the build scripts and macros. Either
will work, even for binary installs on Windows, f.e. I would really
like to define a clear location relative to the extension directory.
This location will be used by this new feature, if implemented. I also
think that it should be either part of the current prepend setting, by
default and system only, or another setting. What I really consider as
a bad practice is any kind of bundling scripts in the extensions, that
will be a real pain to maintain and it opens some unknown can of
worms. Distros may also prefer to have that script outside the
extension, easier for cherry-picked updates when necessary (f.e.
security fixes).

I am sorry to say that but, with all due respect, I have a totally opposite opinion.

IMHO, this PHP code (and any other data we want to embed) is an
integral part of the extension. A compiled extension is a consistent
piece of code, it is contained in a single file and should not be
split to separate files.

That's very much the case. One extension, one "install". It doesn't
matter whether some of the extension is written in C, and other parts in
PHP. HHVM is all about this. Making use of C where you need it, and
otherwise just write the simpler but integral border functionality in
PHP for faster maintenance and development.

The first reason is that, if we consider a released extension as a
conceptual unit, the best way to protect its integrity is to store it
as a single file. Storing it as separate files brings a lot of
potential issues : files can be renamed, deleted, etc. Offline tools
like composer can take care of integrity but, from a final user pov,
it will never be as clear as a single file (.so for extensions, or
package for PHP software).

+1

The second reason, more important, is about inter-communication
between C and PHP. I understand the pov : it is easier to allow
distros to upgrade an extension's PHP code without upgrading the C
code. But, if we consider the PHP code as an integral part of the
extension, this should be avoided, as C and PHP code need to be kept
in sync.

+1

If you allow upgrading the PHP code only, you must consider them as
two separate pieces of code and, so, consider BC in the way they
communicate together. This is a perfectly valid scheme, but, in this
case, this PHP code is not part of your extension anymore, it is a
separate software with its own version/release numbering, which
communicates with your extension. If distros could cherry-pick the C
files they want to upgrade, depending on the features they want to
provide, would you be ready to keep BC between all your C files ? The
same if libc was distributed as a bunch of individually-upgradable
files, each defining a function, with consistency ensured by metadata
and an offline tool.

So, to summarize my opinion : if an extension writer wants to
distribute part of his code in PHP and if he wants this code to be
upgradeable in an independent way, he must distribute them as an
independent software package, not as part of the extension. The whole
code to consider as part of an extension, whatever the language, must
be kept in sync. And the best way to achieve this is to store it in a
single file.

+1 — and the latter is what Benjamin was suggesting, albeit perhaps in
not the most technologically sound way.

cheers,
Derick

--
http://derickrethans.nl | http://xdebug.org
Like Xdebug? Consider a donation: http://xdebug.org/donate.php
twitter: @derickr and @xdebug
Posted with an email client that doesn't mangle email: alpine

10 years ago by Pierre Joye

Hi,

From: Pierre Joye [mailto:pierre.php@gmail.com]

We have no easy way to actually release and deploy adhoc scripts,
used by a given extension

For 2., it is one of the things I can imagine implementing in pickle.
Or even better, add it as part of the build scripts and macros. Either
will work, even for binary installs on Windows, f.e. I would really
like to define a clear location relative to the extension directory.
This location will be used by this new feature, if implemented. I also
think that it should be either part of the current prepend setting, by
default and system only, or another setting. What I really consider as
a bad practice is any kind of bundling scripts in the extensions, that
will be a real pain to maintain and it opens some unknown can of
worms. Distros may also prefer to have that script outside the
extension, easier for cherry-picked updates when necessary (f.e.
security fixes).

I am sorry to say that but, with all due respect, I have a totally
opposite opinion.

IMHO, this PHP code (and any other data we want to embed) is an
integral part of the extension. A compiled extension is a consistent
piece of code, it is contained in a single file and should not be
split to separate files.

That's very much the case. One extension, one "install". It doesn't
matter whether some of the extension is written in C, and other parts in
PHP. HHVM is all about this. Making use of C where you need it, and
otherwise just write the simpler but integral border functionality in
PHP for faster maintenance and development.

It is neither correct nor relevant to compare HHVM extension development with
this case. With HHVM it ends up as an actual native extension, while here
it is proposed to bundle scripts that will be executed at runtime like any
other script, except that nothing can be done with them, not even disable
them if not required (like using its own glue codes).

Even Zephir provides an easy way to develop extensions and, again, it ends up
as native code.

The more arguments being brought in this thread the more I am convinced
that we are trying to tackle the wrong problems.

10 years ago by francois@tekwire.net

From: Pierre Joye [mailto:pierre.php@gmail.com]

... here,
it is proposed to bundle scripts that will be executed at runtime like any
other script, except that nothing can be done with them, not even disable
them if not required (like using its own glue codes).

I agree. Bundling scripts in extensions to execute them at each RINIT is, IMO, not a good idea (mostly for performance reasons and lack of control, as you note), but I keep thinking that a mechanism to embed PHP scripts in extensions and make them available via a common stream wrapper can be useful. What I don't like is executing them automatically at every RINIT. I prefer to leave the extension free to load its PHP code when its logic decides it is needed.
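
To make the "embed, but load on demand" idea concrete, here is a minimal sketch in plain C, assuming the scripts are compiled into the extension as a simple name/source table. The struct, the function embedded_script_find() and the script names are all hypothetical; this is not PHP's stream wrapper API and not how any existing extension does it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical table entry: one PHP script compiled into the extension. */
struct embedded_script {
    const char *name;   /* logical name the extension asks for */
    const char *source; /* PHP source text stored in the binary */
};

/* Illustrative contents only; a real build step would generate this. */
static const struct embedded_script embedded_scripts[] = {
    { "glue.php",    "<?php function ext_helper() { return 42; }" },
    { "backend.php", "<?php /* back-end code used only by the C side */" },
};

/* Returns the embedded source for `name`, or NULL if no such script exists.
 * Nothing is executed here: the caller decides if and when to run it. */
const char *embedded_script_find(const char *name)
{
    size_t count = sizeof embedded_scripts / sizeof embedded_scripts[0];

    assert(name != NULL);
    for (size_t i = 0; i < count; i++) {
        if (strcmp(embedded_scripts[i].name, name) == 0) {
            return embedded_scripts[i].source;
        }
    }
    return NULL;
}
```

The point of the sketch is only that lookup and execution are separate steps, so nothing has to run at RINIT unless the extension asks for it.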

Pickle integration is almost finished, that solves your worries about composer and pecl extensions.

Both mechanisms can coexist. They are not incompatible and probably fulfill different needs.

You are asking for a standard path for plain files linked to an extension. As putting the shared library and plain files together is not a good idea, what about something like this :

<php-install-prefix>/lib/php/extensions/php/<extension-name>

A C macro can be defined to provide the path to the extension C code, in order to avoid hardcoding it (this way, it could be modified in the future).

At install time, we could say that all files under <extension-source-dir>/php/install, if this dir exists, are installed there.

It is just an example but does it correspond to the mechanism you were thinking about ?
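
A rough sketch of what such a path helper could look like, assuming the layout proposed above. The macro name EXT_PHP_SCRIPT_ROOT and the function ext_script_dir() are hypothetical and do not exist in PHP's build system; the install prefix is passed in by the caller.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical macro: the directory, relative to the PHP install prefix,
 * under which each extension gets its own sub-directory of PHP files. */
#ifndef EXT_PHP_SCRIPT_ROOT
#define EXT_PHP_SCRIPT_ROOT "lib/php/extensions/php"
#endif

/* Writes "<prefix>/lib/php/extensions/php/<ext_name>" into buf.
 * Returns 0 on success, -1 if ext_name is empty or buf is too small. */
int ext_script_dir(const char *prefix, const char *ext_name,
                   char *buf, size_t buf_len)
{
    int n;

    assert(prefix != NULL && ext_name != NULL && buf != NULL);
    if (strlen(ext_name) == 0) {
        return -1;
    }
    n = snprintf(buf, buf_len, "%s/%s/%s", prefix, EXT_PHP_SCRIPT_ROOT, ext_name);
    return (n < 0 || (size_t)n >= buf_len) ? -1 : 0;
}
```

Because the location lives in one macro, it could be changed later without touching extension code, which is the reason for not hardcoding it.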

What is not clear to me is the question of packaging/versioning/deployment. If you consider the PHP glue code is modified more often than the C code, which will be the most frequent case, how do you version/distribute partial packages ? or do you deliver 2 packages with independent versioning and dependencies together ? In this case, should we define a versioning convention for both packages ?
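
To illustrate what a versioning convention between the two packages might mean in practice, here is a small sketch. The rule it encodes (same major and minor number, patch level free to differ) is purely an assumption for the sake of the example, as is the function name glue_is_compatible(); nothing in this thread defines such a convention.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical compatibility rule: the PHP glue code and the C extension
 * must share major and minor version; the patch number may differ, so the
 * glue code can receive fix-only releases more often than the C code.
 * Returns 1 if compatible, 0 if not or if either string is not "X.Y.Z". */
int glue_is_compatible(const char *ext_version, const char *glue_version)
{
    int emaj, emin, epatch;
    int gmaj, gmin, gpatch;

    assert(ext_version != NULL && glue_version != NULL);
    if (sscanf(ext_version, "%d.%d.%d", &emaj, &emin, &epatch) != 3) {
        return 0;
    }
    if (sscanf(glue_version, "%d.%d.%d", &gmaj, &gmin, &gpatch) != 3) {
        return 0;
    }
    return emaj == gmaj && emin == gmin;
}
```

With a check like this the extension could refuse to load mismatched glue code, at the cost of having to maintain the convention across both release streams.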

Regards

François

10 years ago by Pierre Joye

hi François,

From: Pierre Joye [mailto:pierre.php@gmail.com]

... here,
it is proposed to bundle scripts that will be executed at runtime like any
other script, except that nothing can be done with them, not even disable
them if not required (like using its own glue codes).

I agree. Bundling scripts in extensions to execute them at each RINIT is, IMO, not a good idea (mostly for performance reasons and lack of control, as you note), but I keep thinking that a mechanism to embed PHP scripts in extensions and make them available via a common stream wrapper can be useful. What I don't like is the fact to execute them automatically at every RINIT. I prefer to let the extension free to load its PHP code when its logic decides it is needed.

Or let the extension define a script to prepend, preload, or whatever we
end up doing.

It is however important, as you mentioned, to leave the engine and the
user the ability not to load it, at all.

Pickle integration is almost finished, that solves your worries about composer and pecl extensions.

Both mechanisms can coexist. They are not incompatible and probably fulfill different needs.

Indeed, but I did not refer to separate releases here but to the ability
for Composer to install extensions. If a script is part of an extension
release, it will be installed as well.

You are asking for a standard path for plain files linked to an extension. As putting the shared library and plain files together is not a good idea, what about something like this :

<php-install-prefix>/lib/php/extensions/php/<extension-name>

A C macro can be defined to provide the path to the extension C code, in order to avoid hardcoding it (this way, it could be modified in the future).

I suppose you mean to the extension PHP code, right? That's what is
done already for include_path for example. It has to be system and
optionally loaded.

At install time, we could say that every files under <extension-source-dir>/php/install, if this dir exists, are installed there.

It is just an example but does it correspond to the mechanism you were thinking about ?

Yes, and it is very easy to do and maintain. At least compared to
bundling script into binaries. Our build scripts already support
installing custom files out of the box for example.

What is not clear to me is the question of packaging/versioning/deployment. If you consider the PHP glue code is modified more often than the C code, which will be the most frequent case, how do you version/distribute partial packages ? or do you deliver 2 packages with independent versioning and dependencies together ? In this case, should we define a versioning convention for both packages ?

It is up to the maintainers. I would not do separate releases as it
makes it hard to track or support. Doing more releases even to "only"
fix the PHP glue codes sounds like a good practice anyway.

Cheers,

Pierre

@pierrejoye | http://www.libgd.org

10 years ago by Lester Caine

What is not clear to me is the question of packaging/versioning/deployment. If you consider the PHP glue code is modified more often than the C code, which will be the most frequent case, how do you version/distribute partial packages ? or do you deliver 2 packages with independent versioning and dependencies together ? In this case, should we define a versioning convention for both packages ?
It is up to the maintainers. I would not do separate releases as it
makes it hard to track or support. Doing more releases even to "only"
fix the PHP glue codes sounds like a good practice anyway.

Since the large majority of PHP installs are already handled by third
party systems isn't this discussion somewhat academic?

I've commented in the past on the discrepancies between different
installations, and I still run an 'ini' structure which comes from the
SUSE package manager. Separate .ini files for each extension so I can
switch extensions on and off simply by the set of files selected. SUSE
handled all of the 'install' problems for a particular extension
ensuring all the right third-party stuff is included and even creating
useful PHP additions.

The problem nowadays is that each distribution uses its own set of
rules and even the third party windows distributions provide a modular
platform with their own way of handling modules, while the main PHP
distribution is still handled as a single one off 'core'.

I understand the problem that if something gets changed every single
extension has to be rebuilt but with all of the tools available today is
that really necessary? Can we not come up with a system for PHP7 that
allows a security fix to say one of the database packages without
needing to recompile the whole installation? Does any current Linux
distribution push the whole PHP stack when they include fixes in
extensions today?

--
Lester Caine - G8HFL

Contact - http://lsces.co.uk/wiki/?page=contact
L.S.Caine Electronic Services - http://lsces.co.uk
EnquirySolve - http://enquirysolve.com/
Model Engineers Digital Workshop - http://medw.co.uk
Rainbow Digital Media - http://rainbowdigitalmedia.co.uk

10 years ago by Derick Rethans

From: Pierre Joye [mailto:pierre.php@gmail.com]

... here, it is proposed to bundle scripts that will be executed at
runtime like any other script, except that nothing can be done with
them, not even disable them if not required (like using its own
glue codes).

I agree. Bundling scripts in extensions to execute them at each
RINIT is, IMO, not a good idea (mostly for performance reasons and
lack of control, as you note), but I keep thinking that a mechanism
to embed PHP scripts in extensions and make them available via a
common stream wrapper can be useful. What I don't like is the fact
to execute them automatically at every RINIT. I prefer to let the
extension free to load its PHP code when its logic decides it is
needed.

Or let the extension defines a script to prepend, preload, whatever we
end to do.

It is however important, as you mentioned, to leave the engine and the
user the ability not to load it, at all.

No no, not loading the code is exactly what I want to prevent people
from doing! The PHP library code is an integral part of the extension.
Without it, its APIs and functionality would be useless.

pickle, two packages, or any of that other malarkey are not going to fix
this. It needs to be one package for this to work well, without the
possibility of users messing things up.

Derick

10 years ago by Pierre Joye

From: Pierre Joye [mailto:pierre.php@gmail.com]

... here, it is proposed to bundle scripts that will be executed at
runtime like any other script, except that nothing can be done with
them, not even disable them if not required (like using its own
glue codes).

I agree. Bundling scripts in extensions to execute them at each
RINIT is, IMO, not a good idea (mostly for performance reasons and
lack of control, as you note), but I keep thinking that a mechanism
to embed PHP scripts in extensions and make them available via a
common stream wrapper can be useful. What I don't like is the fact
to execute them automatically at every RINIT. I prefer to let the
extension free to load its PHP code when its logic decides it is
needed.

Or let the extension defines a script to prepend, preload, whatever we
end to do.

It is however important, as you mentioned, to leave the engine and the
user the ability not to load it, at all.

No no, not loading the code is exactly what I want to prevent people
from doing! The PHP library code is an integral part of the extension.
Without it, its APIs and functionality would be useless.

Sorry but yes. It is PHP code, and I, for one, want to be able not to
use it if I do not like it or do not need it.

pickle, two packages, or any of that other malarkey are not going to fix
this. It needs to be one package for this to work well, without the
possibility of users messing things up.

Users are not stupid. It is possible to have one package only already. Take #24.

Now I will come back with a patch and examples to provide a solution that
users cannot mess up, as you nicely put it. Keep in mind that
it is already possible to do, just not as easily and user-friendly as it
should be (it needs a manual include or the like).

Cheers,

Pierre

@pierrejoye | http://www.libgd.org

10 years ago by Derick Rethans

From: Pierre Joye [mailto:pierre.php@gmail.com]

... here,
it is proposed to bundle scripts that will be executed at runtime like any
other script, except that nothing can be done with them, not even disable
them if not required (like using its own glue codes).

I agree. Bundling scripts in extensions to execute them at each RINIT
is, IMO, not a good idea (mostly for performance reasons and lack of
control, as you note), but I keep thinking that a mechanism to embed
PHP scripts in extensions and make them available via a common stream
wrapper can be useful.

It should really be in MINIT... but I guess that doesn't work well.

What I don't like is the fact to execute them automatically at every
RINIT. I prefer to let the extension free to load its PHP code when
its logic decides it is needed.

I really don't see the problem if the PHP code is really part of the
extension. It just defines extra classes!

cheers,
Derick

10 years ago by Benjamin Eberlei

On Wed, Jan 7, 2015 at 11:14 PM, François Laupretre francois@tekwire.net
wrote:

From: Pierre Joye [mailto:pierre.php@gmail.com]

... here,
it is proposed to bundle scripts that will be executed at runtime like any
other script, except that nothing can be done with them, not even disable
them if not required (like using its own glue codes).

I agree. Bundling scripts in extensions to execute them at each RINIT is,
IMO, not a good idea (mostly for performance reasons and lack of control,
as you note), but I keep thinking that a mechanism to embed PHP scripts in
extensions and make them available via a common stream wrapper can be
useful. What I don't like is the fact to execute them automatically at
every RINIT. I prefer to let the extension free to load its PHP code when
its logic decides it is needed.

To be honest, I don't see the use case of shipping optional PHP code inside
an extension. As the user of an extension I want all the functions/classes
to be available all the time, no matter if the extension developer wrote
everything in C or in PHP.

For optional code there is Composer/PEAR/php include path, this is already
a solved problem.

Pickle integration is almost finished, that solves your worries about
composer and pecl extensions.

Both mechanisms can coexist. They are not incompatible and probably
fulfill different needs.

You are asking for a standard path for plain files linked to an extension.
As putting the shared library and plain files together is not a good idea,
what about something like this :

<php-install-prefix>/lib/php/extensions/php/<extension-name>

A C macro can be defined to provide the path to the extension C code, in
order to avoid hardcoding it (this way, it could be modified in the future).

At install time, we could say that every files under
<extension-source-dir>/php/install, if this dir exists, are installed there.

It is just an example but does it correspond to the mechanism you were
thinking about ?

What is not clear to me is the question of
packaging/versioning/deployment. If you consider the PHP glue code is
modified more often than the C code, which will be the most frequent case,
how do you version/distribute partial packages ? or do you deliver 2
packages with independent versioning and dependencies together ? In this
case, should we define a versioning convention for both packages ?

Regards

François

10 years ago by Pierre Joye

hi,

On Wed, Jan 7, 2015 at 11:14 PM, François Laupretre francois@tekwire.net
wrote:

From: Pierre Joye [mailto:pierre.php@gmail.com]

... here,
it is proposed to bundle scripts that will be executed at runtime like any
other script, except that nothing can be done with them, not even disable
them if not required (like using its own glue codes).

I agree. Bundling scripts in extensions to execute them at each RINIT is,
IMO, not a good idea (mostly for performance reasons and lack of control, as
you note), but I keep thinking that a mechanism to embed PHP scripts in
extensions and make them available via a common stream wrapper can be
useful. What I don't like is the fact to execute them automatically at every
RINIT. I prefer to let the extension free to load its PHP code when its
logic decides it is needed.

To be honest, I don't see the use case of shipping optional PHP code inside
an extension. As the user of an extension I want all the functions/classes
to be available all the time, no matter if the extension developer wrote
everything in C or in PHP.

For optional code there is Composer/PEAR/php include path, this is already a
solved problem.

So you are saying that people should be forced to use your glue code
instead of implementing their own when it fits better? I can only
disagree. But even then, it does not make the bundling of php code in
binaries a smart idea.

Cheers,

Pierre

@pierrejoye | http://www.libgd.org

10 years ago by Benjamin Eberlei

You are not forced to use the glue code, it's just always available. That is
what an extension is about: always-available code. An option would of
course be to not use the extension if you don't need it.

Developers can feel free to use the low level API just like PHP also ships
high and low level stream functions for example.

hi,

On Thu, Jan 8, 2015 at 3:41 AM, Benjamin Eberlei kontakt@beberlei.de
wrote:

On Wed, Jan 7, 2015 at 11:14 PM, François Laupretre <francois@tekwire.net>
wrote:

From: Pierre Joye [mailto:pierre.php@gmail.com]

... here,
it is proposed to bundle scripts that will be executed at runtime like any
other script, except that nothing can be done with them, not even disable
them if not required (like using its own glue codes).

I agree. Bundling scripts in extensions to execute them at each RINIT is,
IMO, not a good idea (mostly for performance reasons and lack of control, as
you note), but I keep thinking that a mechanism to embed PHP scripts in
extensions and make them available via a common stream wrapper can be
useful. What I don't like is the fact to execute them automatically at every
RINIT. I prefer to let the extension free to load its PHP code when its
logic decides it is needed.

To be honest, I don't see the use case of shipping optional PHP code inside
an extension. As the user of an extension I want all the functions/classes
to be available all the time, no matter if the extension developer wrote
everything in C or in PHP.

For optional code there is Composer/PEAR/php include path, this is already a
solved problem.

So you are saying that people should be forced to use your glue code
instead of implementing their own when it fits better? I can only
disagree. But even then, it does not make the bundling of php code in
binaries a smart idea.

Cheers,

Pierre

@pierrejoye | http://www.libgd.org

10 years ago by francois@tekwire.net

Quoting Benjamin Eberlei :

To be honest, I don't see the use case of shipping optional PHP code inside
an extension. As the user of an extension I want all the functions/classes
to be available all the time, no matter if the extension developer wrote
everything in C or in PHP.

Quoting Pierre :

So you are saying that people should be forced to use your glue code
instead of implementing their own when it fits better? I can only
disagree.

Quoting myself :

What I don't like is executing PHP code automatically at every
RINIT. I prefer to leave the extension free to load its PHP code when its
logic decides it is needed.

We are all talking about PHP scripts, but with very different uses:

Pierre is talking about 'glue code'. From what I understand, this code is used directly from userspace but is optional, and the final user can decide to use alternate code. So it is legitimate for the use of this code to be under the control of the final user, and for it not to be bundled in the extension library.
Benjamin is talking about user-exposed essential extension classes. These MUST be defined at each request start or the extension loses part, if not all, of its API. His point of view is legitimate too. The only reservation I would have is the overall impact on performance but, if the final user accepts it knowingly, why not?
My use case is slightly more complex. My PHP code contains pure 'back-end' features. This code is used exclusively from the extension's C code and does not expose anything to userspace. To explain my use case to Benjamin, here is an example from the PHK extension: PHK is a package system. It distributes a virtual file tree. When I need to stat or read a virtual file, the C code first checks whether the required data is present in a cache. If not, it loads the appropriate PHP code and calls it to retrieve the data from the package file. This PHP code is long, complex, and deals with many cases and options (virtual file tree, compression, package metadata...). Then, the C code caches the returned data. The next time the same resource is requested, it is retrieved from the cache WITHOUT having to load and execute the PHP code. The C code can decide when PHP code needs to be loaded because it is the only 'user' of this code (pure back-end). I can implement the complex code in PHP because it is behind a cache, but it is essential, for performance reasons, that this code is not loaded when everything I need is in cache. I'd prefer to embed my PHP code in the extension library because I definitely need to keep both codes in sync, and keeping them physically together is, IMHO, the easiest way to ensure this. Today, the runtime PHP code is stored in the package itself. This allows the extension to be optional, but with BC issues I wouldn't have if both codes were stored together.
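
The cache-first flow described for PHK can be sketched in plain C. This is only an illustration under stated assumptions, not PHK's actual code: `slow_backend_fetch` is a hypothetical stand-in for "load the PHP code and call it", and the fixed-size array cache, the path `/virtual/a.txt` and all other names are invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CACHE_SLOTS 8

/* One cached result. Keys and values are borrowed pointers: this sketch
 * assumes the strings outlive the cache (true for string literals). */
struct cache_entry {
    const char *key;
    const char *value;
};

static struct cache_entry cache[CACHE_SLOTS];
static size_t cache_used = 0;

/* Counts how often the slow path ran, so the caching effect is visible. */
static int slow_calls = 0;

/* Hypothetical stand-in for "load the PHP back-end code and call it". */
static const char *slow_backend_fetch(const char *path)
{
    slow_calls++;
    return strcmp(path, "/virtual/a.txt") == 0 ? "contents of a" : "";
}

/* Linear lookup; returns NULL on a cache miss. */
static const char *cache_get(const char *key)
{
    for (size_t i = 0; i < cache_used; i++) {
        if (strcmp(cache[i].key, key) == 0) {
            return cache[i].value;
        }
    }
    return NULL;
}

static void cache_put(const char *key, const char *value)
{
    assert(key != NULL && value != NULL);
    if (cache_used < CACHE_SLOTS) {   /* when full, simply stop caching */
        cache[cache_used].key = key;
        cache[cache_used].value = value;
        cache_used++;
    }
}

/* Fast path first; the slow back end is only touched on a cache miss. */
const char *read_virtual_file(const char *path)
{
    const char *hit = cache_get(path);
    if (hit != NULL) {
        return hit;
    }
    const char *data = slow_backend_fetch(path);
    cache_put(path, data);
    return data;
}
```

Calling `read_virtual_file` twice with the same path runs the slow path once; the second call is served from the cache, which is the property that makes a slow PHP back end acceptable here.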

So, IMO, all 3 usages are legitimate but each of us has something different in mind.

The solution for Pierre's needs is in build/deployment tools and, maybe, a pair of C macros to define standard paths (I proposed one; what do you think about it?).

Benjamin and I would make full use of a generic, easy-to-use mechanism to embed plain files (not only PHP scripts, any data can fit) in an extension library. Then the extension developer decides what it loads at each request start. I'll try to build a prototype for this.

Cheers

François

10 years ago by Fred Emmott

That's very much the case. One extension, one "install". It doesn't
matter whether some of the extension is written in C, and other parts in
PHP. HHVM is all about this. Making use of C where you need it, and
otherwise just write the simpler but integral border functionality in
PHP for faster maintenance and development.

It is not correct nor relevant to compare hhvm extension development with
this case. With hhvm it ends up with an actual native extension while here
it is proposed to bundle scripts that will be executed at runtime like any
other script, except that nothing can be done with them, not even disable
them if not required (like using its own glue codes).

Both HHVM and its extensions embed PHP source in the binaries.

In our experience:

- Bundling them together makes it much quicker to iterate and improve extensions.
- The PHP code tends to use ‘non-public’ parts of the C API - so it really needs to be kept in sync with the PHP version. I’m not sure if composer currently supports ‘if I’m on PHP 5.6.0 use this library, if I’m on 5.6.1 use this library…’. While this is fixable, it seems like a lot of boilerplate.
- If PHP7 does develop a JIT, the PHP implementations could be faster than calling the C code (type specializing, and the overhead of setting up the stack for a C function call). This is often the case for HHVM.
- We currently do not support extension-style parameter coercion in PHP functions; this leads to small incompatibilities in our case, which we usually address by writing a C++ function that proxies to a hidden PHP implementation. For PHP7, if the current scalar types RFC lands, this won’t be a problem; if it doesn’t, there’s a chance that multiple functions from the same extension will have confusingly different parameter coercing behavior, depending on the author’s rigor.

We solved some of the other problems by only allowing definitions, no side-effects - classes, constants, namespaces etc. We do not allow code outside of a function body here.

HHVM-specific details (just FYI, agree that they’re probably not particularly relevant):

- We do not translate the PHP source to native code (unless it ends up being jitted at runtime)
- We’re planning on supporting the extension-style coercion in Hack code with a user attribute

To dig the code out:

$ objdump -h hhvm | egrep ' (ext|systemlib)' # also works for a dynamic extension .so
…
149 ext.ff104b2dfab9 00002410 0000000000000000 0000000000000000 02d350a5 20
150 systemlib 000766a2 0000000000000000 0000000000000000 02d374b5 20
$ objdump -s -j systemlib hhvm | head -n 20

hhvm: file format elf64-x86-64

Contents of section systemlib:
00000 3c3f6868 0a2f2f20 7b407d67 656e6572 <?hh.// {@}gener
00010 61746564 0a0a6e61 6d657370 61636520 ated..namespace
00020 7b0a0a2f 2f206465 6661756c 74206261 {..// default ba
00030 73650a63 6c617373 20737464 436c6173 se.class stdClas
00040 73207b0a 7d0a0a2f 2f207573 65642069 s {.}..// used i
00050 6e20756e 73657269 616c697a 65282920 n unserialize()
00060 666f7220 756e6b6e 6f776e20 636c6173 for unknown clas
00070 7365730a 636c6173 73205f5f 5048505f ses.class __PHP_
00080 496e636f 6d706c65 74655f43 6c617373 Incomplete_Class
00090 207b0a20 20707562 6c696320 245f5f50  {.  public $__P
000a0 48505f49 6e636f6d 706c6574 655f436c HP_Incomplete_Cl
000b0 6173735f 4e616d65 3b0a7d0a 7d0a0a6e ass_Name;.}.}..n
000c0 616d6573 70616365 207b0a0a 2f2f2055 amespace {..// U
000d0 73656420 61732061 2073656e 74696e65 sed as a sentine
000e0 6c207479 70652069 6e203836 70696e69 l type in 86pini
000f0 7428292e 0a636c61 7373205f 5f70696e t()..class __pin

10 years ago by Pierre Joye

That's very much the case. One extension, one "install". It doesn't
matter whether some of the extension is written in C, and other parts in
PHP. HHVM is all about this. Making use of C where you need it, and
otherwise just write the simpler but integral border functionality in
PHP for faster maintenance and development.

It is not correct nor relevant to compare hhvm extension development with
this case. With hhvm it ends up with an actual native extension while here
it is proposed to bundle scripts that will be executed at runtime like any
other script, except that nothing can be done with them, not even disable
them if not required (like using its own glue codes).

Both HHVM and its extensions embed PHP source in the binaries.

In our experience:

bundling them together makes it much quicker to iterate and improve extensions.

Releasing them together allows that. Bundling does not help to improve
extensions; at most it makes them easier to install, but that's about it.

the PHP code tends to use ‘non-public’ parts of the C API - so it really needs to be kept in sync with the PHP version. I’m not sure if composer currently supports ‘if I’m on PHP 5.6.0 use this library, if I’m on 5.6.1 use this library…’. While this is fixable, it seems like a lot of boilerplate.

If PHP7 does develop a JIT, the PHP implementations could be faster than calling the C code (type specializing, and the overhead of setting up the stack for a C function call). This is often the case for HHVM.

How does it help? For PHP, an extension will be built against a given
version. Unless a script per version is created, bundling scripts will
be of no help to extension developers, as most extensions can
be built against many versions.

I have no idea if there are custom hhvm extensions out there, but they
will have to deal with the same issue: how to make it work with
multiple versions of hhvm. This is not the case now for the bundled
ones; they are kept in sync with hhvm directly, just like the core PHP
extensions are in sync with the engine changes.

Off topic: btw, how we want to deal with 5.x and 7.x from an extension
point of view is still a big question. Multiple branches, multiple src
files, multiple releases, etc.

we currently do not support extension-style parameter coercion in PHP functions; this leads to small incompatibilities in our case, which we usually address by writing a C++ function that proxies to a hidden PHP implementation. For PHP7, if the current scalar types RFC lands, this won’t be a problem; if it doesn’t, there’s a chance that multiple functions from the same extension will have confusingly different parameter coercing behavior, depending on the author’s rigor.

I hope it will pass, we badly need that :)

We solved some of the other problems by only allowing definitions, no side-effects - classes, constants, namespaces etc. We do not allow code outside of a function body here.

Yes, we will have to do that, no matter what we choose. I am however
not sure how to do it with the current engine design. Using
zend_compile (or a derived implementation) may help.

HHVM-specific details (just FYI, agree that they’re probably not particularly relevant):

We do not translate the PHP source to native code (unless it ends up being jitted at runtime)

Which is very likely to be the case for code used in many requests.
PHP does not have a JIT, certainly not in 6.0, maybe in 6.x tho'

We’re planning on supporting the extension-style coercion in Hack code with a user attribute

All in all, I have a hard time comparing hhvm to php in this case. It
is not really PHP. hhvm is by design much more suited to allow that;
additional syntax and JIT make this whole thing a totally different
story.

The fact that hhvm implements a significant part of the extensions (or
other areas) using PHP+additional syntax, as well as adding cleaner
APIs or mechanisms for the C parts, only confirms one thing for me: the
very first problem we have to solve is to ease extension creation,
by drastically changing the internals APIs & tools. Bundling scripts
does not help here; we are using Scotch tape to repair something
that should have been replaced or redesigned long ago. I am
not blaming anyone; the engine design, historically, does not make
such changes easy.

Cheers,
Pierre

10 years ago by francois@tekwire.net

From: Pierre Joye [mailto:pierre.php@gmail.com]

We solved some of the other problems by only allowing definitions, no
side-effects - classes, constants, namespaces etc. We do not allow code
outside of a function body here.

Yes, we will have to do that, no matter what we choose. I am however
not sure how to do it with the current engine design. Using
zend_compile (or a derived implementation) may help.

Is there a technical problem with code located outside of definitions, or is it just a convention (PSR already contains this rule)? I understand the problems it poses in HHVM, but I don't see why we should check that with the engine as it is structured now. Even if we can detect such questionable usage (and we can detect it quite easily just using tokenizer output), IMO, we are not responsible for what developers put in their scripts.

Regards,

François

10 years ago by Pierre Joye

From: Pierre Joye [mailto:pierre.php@gmail.com]

We solved some of the other problems by only allowing definitions, no
side-effects - classes, constants, namespaces etc. We do not allow code
outside of a function body here.

Yes, we will have to do that, no matter what we choose. I am however
not sure how to do it with the current engine design. Using
zend_compile (or a derived implementation) may help.

Is there a technical problem with code located outside of definitions, or is it just a convention (PSR already contains this rule)? I understand the problems it poses in HHVM, but I don't see why we should check that with the engine as it is structured now. Even if we can detect such questionable usage (and we can detect it quite easily just using tokenizer output), IMO, we are not responsible for what developers put in their scripts.

I was too vague.

If some code is loaded and part of it will always be executed even if
not used at all in a request, this will have side effects. Also,
besides the impact of having code run on each request, I do not
think it is a good idea to have code running in the top-level context at
all :)

Cheers,

Pierre

@pierrejoye | http://www.libgd.org

10 years ago by francois@tekwire.net

-----Original Message-----
From: Pierre Joye [mailto:pierre.php@gmail.com]

Is there a technical problem with code located outside of definitions, or is it
just a convention (PSR already contains this rule)? I understand the
problems it poses in HHVM, but I don't see why we should check that with
the engine as it is structured now. Even if we can detect such questionable
usage (and we can detect it quite easily just using tokenizer output), IMO,
we are not responsible for what developers put in their scripts.

I was too vague.

If some code is loaded and part of it will always be executed even if
not used at all in a request, this will have side effects. Also,
besides the impact of having code run on each request, I do not
think it is a good idea to have code running in the top-level context at
all :)

OK, I understand that it is a very bad idea, but I am not sure we should spend time working on tools to prevent it.

The first reason is that there may be cases where this would be legitimate. I can't imagine any today, but we are not aware of every way PHP can be used.

The second reason is that it is not a feature for newbies. People writing code to be executed at each request start are supposed to know what they're doing. If they are foolish enough to introduce such crap, it is their problem, not ours.

Cheers

François

10 years ago by Pierre Joye

-----Original Message-----
From: Pierre Joye [mailto:pierre.php@gmail.com]

Is there a technical problem with code located outside of definitions, or is it
just a convention (PSR already contains this rule)? I understand the
problems it poses in HHVM, but I don't see why we should check that with
the engine as it is structured now. Even if we can detect such questionable
usage (and we can detect it quite easily just using tokenizer output), IMO,
we are not responsible for what developers put in their scripts.

I was too vague.

If some code is loaded and part of it will always be executed even if
not used at all in a request, this will have side effects. Also,
besides the impact of having code run on each request, I do not
think it is a good idea to have code running in the top-level context at
all :)

OK, I understand that it is a very bad idea, but I am not sure we should
spend time working on tools to prevent it.

The first reason is that there may be cases where this would be
legitimate. I can't imagine any today, but we are not aware of every way
PHP can be used.

The second reason is that it is not a feature for newbies. People writing
code to be executed at each request start are supposed to know what they're
doing. If they are foolish enough to introduce such crap, it is their problem,
not ours.

It becomes a user and support issue (which too often begins at php.net first)

Cheers

François

10 years ago by francois@tekwire.net

-----Original Message-----
From: php@golemon.com [mailto:php@golemon.com] On behalf of Sara Golemon

So, I've been meaning to propose something like this, but with a few
key implementation detail differences:

Create the notion of "Persistent User Functions/Classes/Constants/etc

Embedded text sections

First, persistent PHP code or data (NOT reloading code for each request) is a huge can of worms. If Sara decides to re-launch the idea, that will be great but the subject is probably out of scope here.

IMHO, what we need here is a generic mechanism to embed PHP code in C extensions and, then, execute it. It should define a way to embed PHP scripts as C strings at compile time, how to make these scripts known and accessible from the PHP core, and how to execute them when needed.

Instead of including the runtime code in every extension, as is done in ext-embed, I would prefer an extension that provides this service to other 'client' extensions. It would define a stream wrapper to access the registered PHP scripts. Scripts would be registered in persistent memory during MINIT. The registration would return an ID for each registered script. These IDs would be stored by the client extension and would be used to derive a stream-wrapped path when the client decides to execute a registered script. These paths would uniquely reference the scripts, which would allow them to be opcode-cached.
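
To make the registration idea more concrete, here is a minimal stand-alone C sketch of such a registry. It deliberately uses no PHP internals: the `embed://` scheme, the function names and the fixed-size table are all assumptions made up for illustration, and a real implementation would register an actual stream wrapper and allocate in persistent memory.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_SCRIPTS 16

/* A PHP script embedded as a C string at compile time (hypothetical). */
static const char example_script[] = "<?php function f() {}";

/* One registered script: the owning extension's name and the source text.
 * Both are borrowed pointers to static strings in this sketch. */
struct embedded_script {
    const char *owner;
    const char *source;
};

static struct embedded_script registry[MAX_SCRIPTS];
static int registry_count = 0;

/* Would be called by a client extension during MINIT.
 * Returns the script ID (>= 0), or -1 if the table is full. */
int script_register(const char *owner, const char *source)
{
    if (registry_count >= MAX_SCRIPTS) {
        return -1;
    }
    registry[registry_count].owner = owner;
    registry[registry_count].source = source;
    return registry_count++;
}

/* Derives a unique path "embed://<owner>/<id>" from a script ID.
 * Returns 0 on success, -1 for an unknown ID or a too-small buffer. */
int script_path(int id, char *buf, size_t len)
{
    if (id < 0 || id >= registry_count) {
        return -1;
    }
    int n = snprintf(buf, len, "embed://%s/%d", registry[id].owner, id);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* What the stream wrapper would serve when the path is opened. */
const char *script_source(int id)
{
    return (id < 0 || id >= registry_count) ? NULL : registry[id].source;
}
```

Because each path is derived from a stable ID, the same script always maps to the same path, which is the property an opcode cache would need in order to key on it.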

As a specific case, if an extension wants to execute some PHP code at the beginning of each request, it can execute its script(s) during RINIT, but it is just a use case of a more general mechanism. RINIT execution order can be resolved using module dependencies.

Extending the mechanism to userspace 'plain file' scripts could be provided by a 'bridge'. The user could register a plain file script, whose content would be stored in persistent memory, with a flag that would cause the script to be loaded during every RINIT.

One more word about executing scripts during every RINIT: even with opcode caches, as Sara notes, the performance hit is not negligible. The question is always the same: what percentage of requests will really use the classes/functions we define there? Remember that the list of scripts is hardcoded in an extension. If the extension wants to allow conditional loading for different features, it needs to define appropriate ini settings. Choosing the list of scripts at compile time is another option but would make the distribution of precompiled binaries complex.

I personally favor another approach. I still use the PHK extension as an example of this 'intermediate' approach. This extension defines C-level front-end classes. These classes contain the 'fast' code. When this code needs to access 'slow' PHP code, it executes the PHP scripts it needs, then calls the just-defined PHP classes. This way, we preserve the possibility to code a big part of the extension in PHP, without the overhead of loading the PHP code at the beginning of each request.

Thanks for your attention :)

François

10 years ago by Pierre Joye

hi,

-----Original Message-----
From: php@golemon.com [mailto:php@golemon.com] On behalf of Sara Golemon

So, I've been meaning to propose something like this, but with a few
key implementation detail differences:

Create the notion of "Persistent User Functions/Classes/Constants/etc

Embedded text sections

First, persistent PHP code or data (NOT reloading code for each request) is a huge can of worms. If Sara decides to re-launch the idea, that will be great but the subject is probably out of scope here.

IMHO, what we need here is a generic mechanism to embed PHP code in C extensions and, then, execute it. It should define a way to embed PHP scripts as C strings at compile time, how to make these scripts known and accessible from the PHP core, and how to execute them when needed.

Instead of including the runtime code in every extension, as is done in ext-embed, I would prefer an extension that provides this service to other 'client' extensions. It would define a stream wrapper to access the registered PHP scripts. Scripts would be registered in persistent memory during MINIT. The registration would return an ID for each registered script. These IDs would be stored by the client extension and would be used to derive a stream-wrapped path when the client decides to execute a registered script. These paths would uniquely reference the scripts, which would allow them to be opcode-cached.

As a specific case, if an extension wants to execute some PHP code at the beginning of each request, it can execute its script(s) during RINIT, but it is just a use case of a more general mechanism. RINIT execution order can be resolved using module dependencies.

Extending the mechanism to userspace 'plain file' scripts could be provided by a 'bridge'. The user could register a plain file script, whose content would be stored in persistent memory, with a flag that would cause the script to be loaded during every RINIT.

One more word about executing scripts during every RINIT: even with opcode caches, as Sara notes, the performance hit is not negligible. The question is always the same: what percentage of requests will really use the classes/functions we define there? Remember that the list of scripts is hardcoded in an extension. If the extension wants to allow conditional loading for different features, it needs to define appropriate ini settings. Choosing the list of scripts at compile time is another option but would make the distribution of precompiled binaries complex.

I personally favor another approach. I still use the PHK extension as an example of this 'intermediate' approach. This extension defines C-level front-end classes. These classes contain the 'fast' code. When this code needs to access 'slow' PHP code, it executes the PHP scripts it needs, then calls the just-defined PHP classes. This way, we preserve the possibility to code a big part of the extension in PHP, without the overhead of loading the PHP code at the beginning of each request.

Thanks for your attention :)

It looks to me that it is going slightly too far, or I totally miss
why it is such a critical feature.

As a developer of (too) many extensions, I can only agree with other
developers saying that many times, writing some glue code in PHP is
way easier than doing it in C using the PHP internals APIs. This is
one of the biggest problems PHP internals have. Sara and many others
have pointed out this problem numerous times in the past. PHP 7 is a
unique opportunity to provide easier APIs to develop extensions, even
if we keep the existing ones for obvious reasons.

For example HHVM provides a neat one, which is simply PHP for simple
cases. Complex cases allow advanced the native support is also
supported. See https://github.com/facebook/hhvm/wiki/Extension-API.

Zephir (https://github.com/phalcon/zephir) also provides a clean
solution that makes PHP extensions easier to implement and maintain.

Python and Perl also provide much cleaner APIs, and a mix of C and
scripts, for extension developers.

To me, it looks like this is what we should try to solve instead of
creating something we will most likely regret very soon. Benjamin's
original proposal (if technically possible in such a simple way, but I
am not sure about that :) ) is somehow more acceptable than Sara's
(scripts statically built with the extension). For one, statically building
scripts into extensions looks to me like statically linking libc. It
also kills any of the advantages of having glue code, from a release
point of view.

The last thing I do not understand, from Benjamin's point of view, is
the performance part. PHP glue code is used for areas where
performance does not matter, but then we may end up adding this
feature because the performance impact is too big. What am I missing? If
the prepend or script loading processes are too slow, we should try to
fix this bottleneck instead of opening yet another can of worms.
However, I do not see this part of a request as the bottleneck for
most PHP apps (with opcache).

I do not try to be negative or aggressively opposed to this feature. I
simply think we are trying to solve the wrong parts of the problems.

Cheers,

Pierre

@pierrejoye | http://www.libgd.org

10 years ago by Pierre Joye

For example HHVM provides a neat one, which is simply PHP for simple
cases. Complex cases allow advanced the native support is also
supported. See https://github.com/facebook/hhvm/wiki/Extension-API.

... Complex cases can be implemented using the native support. Sorry
for some c/p mess while rephrasing it :)

10 years ago by Remi Collet

On 04/01/2015 12:52, Benjamin Eberlei wrote:

https://wiki.php.net/rfc/extension_prepend_files

Sorry, but this definitely seems a bad idea.

If you want a pure-PHP library, provide it as one.

If this library needs an extension, just describe this in the metadata
(pecl, composer, ...)

Ex:

- twig extension (the C extension is optional in this case)
- xhprof (or its fork)

Having a "huge" piece of code included in "each" request will be a
nightmare.

Yes, a PHP library can be managed per host/dir/app/whatever,
while extensions are enabled for all the SAPIs.

Seems you are trying to solve a "downstream" issue, not a PHP one.

Remi.

10 years ago by Benjamin Eberlei

On 04/01/2015 12:52, Benjamin Eberlei wrote:

https://wiki.php.net/rfc/extension_prepend_files

Sorry, but this definitely seems a bad idea.

If you want a pure-PHP library, provide it as one.

If this library needs an extension, just describe this in the metadata
(pecl, composer, ...)

Ex:

- twig extension (the C extension is optional in this case)
- xhprof (or its fork)

Having a "huge" piece of code included in "each" request will be a
nightmare.

This argument is irrelevant to the RFC's goal in my opinion.

- This RFC is about making it possible "somehow" to bundle PHP code with
  extensions that is required to make the extension work in a sane way for
  the end-user.
- The size of that code or the performance impact is irrelevant imho.
- Users of extensions with this functionality can decide how this affects
  their server performance and if they are ok with that.
- Code that is optional can still be a Composer/PEAR/anything library.
- Everything CAN be misused. You can already write an extension that
  includes LOTS of code in RINIT. This RFC is only about defining a sane
  process for allowing this, instead of the hack that it is now.

Examples of good use-cases for this feature:

- Low-Level MongoDB connection code in C, userland OOP API in PHP.
- Low-Level Crypto code, simplified PHP functions (think ext/hash +
  ext/password)
- Database Vendor Extensions in C + common DB abstraction in PHP (PDO2).
- Low-Level Date handling, high level PHP code

This RFC tries to solve the problems with the current approaches to extensions:

- Writing everything in C, which is difficult to maintain and more prone
  to nasty bugs (segfaults, memory leaks).
- Requiring a PHP library to be installed, which makes installation
  complicated as it's not supported by PHP's extension tooling in a
  straightforward way right now.

I should rewrite the RFC and remove the implementation details, because
essentially the solution could also be tooling based (vs code based).

Yes, a PHP library can be managed per host/dir/app/whatever,
while extensions are enabled for all the SAPIs.

Seems you are trying to solve a "downstream" issue, not a PHP one.

Remi.

10 years ago by Adam Harvey

I'm going to be a bit hazier than normal in this e-mail, for which I
apologise. People who know who I work for, you can probably guess the
parameters of the NDA I'm trying not to break here.

<+1 on everything I snipped>

Examples of good use-cases for this feature:

Low-Level MongoDB connection code in C, userland OOP API in PHP.

Low-Level Crypto code, simplified PHP functions (think ext/hash +
ext/password)

Database Vendor Extensions in C + common DB abstraction in PHP (PDO2).

Low-Level Date handling, high level PHP code

Let me toss another use case onto the fire.

Imagine you have an extension that replaces the zend_execute_ex
pointer so it can fire hooks before and after a particular function is
called. You can write those hooks in C, but there's no actual reason
they need to be — they're not performance-critical, and don't require
access to any internal APIs.
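
For readers unfamiliar with the pattern, the pointer swap can be sketched in stand-alone C. Nothing below is the real Zend API: `execute_ptr` merely plays the role of a global dispatch pointer such as zend_execute_ex, and the signature, the hook counters and every function name are assumptions chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified, hypothetical signature; the real engine hook differs. */
typedef int (*execute_fn)(int arg);

static int default_execute(int arg)
{
    return arg * 2;
}

/* Global dispatch pointer that an extension is allowed to replace. */
static execute_fn execute_ptr = default_execute;

/* Saved original, so the wrapper can delegate and hooks can be removed. */
static execute_fn original_execute = NULL;
static int before_calls = 0;
static int after_calls = 0;

/* Wrapper: fire the "before" hook, run the original, fire the "after" hook. */
static int hooked_execute(int arg)
{
    before_calls++;
    int result = original_execute(arg);
    after_calls++;
    return result;
}

/* Install once: remember the current pointer, then swap in the wrapper. */
void install_hooks(void)
{
    if (original_execute == NULL) {
        original_execute = execute_ptr;
        execute_ptr = hooked_execute;
    }
}

/* Restore the saved pointer so calls bypass the hooks again. */
void remove_hooks(void)
{
    if (original_execute != NULL) {
        execute_ptr = original_execute;
        original_execute = NULL;
    }
}

/* Every call goes through the pointer, hooked or not. */
int dispatch_call(int arg)
{
    return execute_ptr(arg);
}
```

In the scenario described here, the bodies of the before/after hooks are the part that could live in PHP rather than C; the counters in the sketch only mark where those calls would happen.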

At that point, it would be nice to have a mechanism for shipping PHP
code with your C extension that doesn't require any external
dependencies. As Derick says, "pecl install foo", not "pecl install
foo && composer require foo/bar && some sort of startup code in
userland".

This code isn't optional: it's required for your extension to behave
properly. It's intimately tied to the exact extension you're shipping
— you don't want to expose a stable API to userland for this, because
there's no need, and it's irrelevant for users anyway. Why not allow a
way for extension authors to ship this code as (hopefully) safe,
managed PHP instead of C, in a way that isn't reliant on tooling and
could allow version drift between the C and PHP code bases?

This isn't a new idea. We've talked on IRC about shipping bits of the
standard library as PHP code instead of C for years. Having this
mechanism — whatever form it ends up taking — would help there.

I should rewrite the RFC and remove the implementation details, because
essentially the solution could also be tooling based (vs code based).

It could, but I think there's a benefit in having a non-tooling based
way to do it. Much as I (genuinely) wish everything was open source
and could be installed through PECL, there are plenty of closed source
extensions for PHP.

Adam

10 years ago by Pierre Joye

I'm going to be a bit hazier than normal in this e-mail, for which I
apologise. People who know who I work for, you can probably guess the
parameters of the NDA I'm trying not to break here.

<+1 on everything I snipped>

Examples of good use-cases for this feature:

Low-Level MongoDB connection code in C, userland OOP API in PHP.

Low-Level Crypto code, simplified PHP functions (think ext/hash +
ext/password)

Database Vendor Extensions in C + common DB abstraction in PHP (PDO2).

Low-Level Date handling, high level PHP code

Let me toss another use case onto the fire.

Imagine you have an extension that replaces the zend_execute_ex
pointer so it can fire hooks before and after a particular function is
called. You can write those hooks in C, but there's no actual reason
they need to be — they're not performance-critical, and don't require
access to any internal APIs.

This is something totally different from what we are talking about here.
Yes, such needs will benefit from having a script released with the
extension itself (in one form or another), but such hooks are really
something totally different.

At that point, it would be nice to have a mechanism for shipping PHP
code with your C extension that doesn't require any external
dependencies. As Derick says, "pecl install foo", not "pecl install
foo && composer require foo/bar && some sort of startup code in
userland".

And he is wrong here. It is possible, today.

This code isn't optional: it's required for your extension to behave
properly. It's intimately tied to the exact extension you're shipping
— you don't want to expose a stable API to userland for this, because
there's no need, and it's irrelevant for users anyway. Why not allow a
way for extension authors to ship this code as (hopefully) safe,
managed PHP instead of C, in a way that isn't reliant on tooling and
could allow version drift between the C and PHP code bases?

It is already possible; let's make it even easier, no?

This isn't a new idea. We've talked on IRC about shipping bits of the
standard library as PHP code instead of C for years. Having this
mechanism — whatever form it ends up taking — would help there.

I should rewrite the RFC and remove the implementation details, because
essentially the solution could also be tooling based (vs code based).

It could, but I think there's a benefit in having a non-tooling based
way to do it. Much as I (genuinely) wish everything was open source
and could be installed through PECL, there are plenty of closed source
extensions for PHP.

Btw, closed source extensions could be installed by the new pickle
too. That said, I do not consider closed source PHP scripts relevant;
I never did and most likely never will :)

Cheers,

Pierre

@pierrejoye | http://www.libgd.org

10 years ago by Pierre Joye

Hey everyone,

I want to open discussion on my RFC to strengthen the ability of extensions
to provide functionality to developers in both C and PHP code.

For this extensions can add PHP files to a list of "prepend files" that are
part of every request execution exactly the same way the INI
auto_prepend_file functionality works:

https://wiki.php.net/rfc/extension_prepend_files

I propose implementation details in the RFC, but they are completely up to
discussion. I am even sure there is probably a better way than what I
proposed, because I am not familiar with the code.

Just for the record here:

A proof of concept, IRC log, I am lazy :)

<Pierre> benjamin, Derick
https://gist.github.com/pierrejoye/ce4867a5eaabffa71df4
https://gist.github.com/pierrejoye/0859e3702ceb3bb652b6
https://gist.github.com/pierrejoye/544e60d8994094c55583
<Pierre> too slow internet for a fork & PR
<Pierre> but it works now. Add PHP_BUILTIN_SCRIPT(date,
PHP_EXT_DIR(date)/date.php) to config.m4, and call manually
zend_execute_script in RINIT, could be easier to do it in a register
function, inside MINIT and let the engine do it on RINIT, actually
cleaner, but this patch is only a proof of concept to play with

Cheers,

Pierre

@pierrejoye | http://www.libgd.org

10 years ago by Pierre Joye

Hey everyone,

I want to open discussion on my RFC to strengthen the ability of extensions
to provide functionality to developers in both C and PHP code.

For this extensions can add PHP files to a list of "prepend files" that are
part of every request execution exactly the same way the INI
auto_prepend_file functionality works:

https://wiki.php.net/rfc/extension_prepend_files

I propose implementation details in the RFC, but they are completely up to
discussion. I am even sure there is probably a better way than what I
proposed, because I am not familiar with the code.

Just for the record here:

A proof of concept, IRC log, I am lazy :)

<Pierre> benjamin, Derick
https://gist.github.com/pierrejoye/ce4867a5eaabffa71df4
https://gist.github.com/pierrejoye/0859e3702ceb3bb652b6
https://gist.github.com/pierrejoye/544e60d8994094c55583
<Pierre> too slow internet for a fork & PR
<Pierre> but it works now. Add PHP_BUILTIN_SCRIPT(date,
PHP_EXT_DIR(date)/date.php) to config.m4, and call manually
zend_execute_script in RINIT, could be easier to do it in a register
function, inside MINIT and let the engine do it on RINIT, actually
cleaner, but this patch is only a proof of concept to play with

Also keep in mind that Sara's proposed ideas, which actually tackle the
real problem, have my preference over all other solutions :)

--
Pierre

@pierrejoye | http://www.libgd.org

10 years ago by francois@tekwire.net

From: Pierre Joye [mailto:pierre.php@gmail.com]

A proof of concept, IRC log, I am lazy :)

<Pierre> benjamin, Derick
https://gist.github.com/pierrejoye/ce4867a5eaabffa71df4
https://gist.github.com/pierrejoye/0859e3702ceb3bb652b6
https://gist.github.com/pierrejoye/544e60d8994094c55583
<Pierre> too slow internet for a fork & PR
<Pierre> but it works now. Add PHP_BUILTIN_SCRIPT(date,
PHP_EXT_DIR(date)/date.php) to config.m4, and call manually
zend_execute_script in RINIT, could be easier to do it in a register
function, inside MINIT and let the engine do it on RINIT, actually
cleaner, but this patch is only a proof of concept to play with

Thanks for this. I thought you were opposed to bundling PHP code in extensions... :)

I prefer the solution where the extension executes the script(s) in RINIT or at any other time, instead of registering scripts in the core and letting the core trigger script execution. The main reason is that executing EVERY script in RINIT is just one use case among many others.

The problem I see with zend_execute_string() is the relationship with opcode caches, because the filename you provide is not a real filename. When an opcode cache receives such a name, it will treat it as a valid plain file name and try to stat() it for mtime, which will probably fail... Actually, the opcode cache has no way to understand what's going on here. That's why I proposed using a stream wrapper. It would require a registration mechanism to ensure path uniqueness, but it would allow opcode caching. And opcode caching is most important here, especially for code executed systematically at RINIT time.

Cheers

François

10 years ago by Pierre Joye

From: Pierre Joye [mailto:pierre.php@gmail.com]

A proof of concept, IRC log, I am lazy :)

<Pierre> benjamin, Derick
https://gist.github.com/pierrejoye/ce4867a5eaabffa71df4
https://gist.github.com/pierrejoye/0859e3702ceb3bb652b6
https://gist.github.com/pierrejoye/544e60d8994094c55583
<Pierre> too slow internet for a fork & PR
<Pierre> but it works now. Add PHP_BUILTIN_SCRIPT(date,
PHP_EXT_DIR(date)/date.php) to config.m4, and call manually
zend_execute_script in RINIT, could be easier to do it in a register
function, inside MINIT and let the engine do it on RINIT, actually
cleaner, but this patch is only a proof of concept to play with

Thanks for this. I thought you were opposed to bundling PHP code in
extensions... :)

That does not mean I cannot help provide a basis for discussion.

I prefer the solution where the extension executes the script(s) in RINIT
or at any other time, instead of registering scripts in the core and letting
the core trigger script execution. The main reason is that executing EVERY
script in RINIT is just one use case among many others.

It will remain in RINIT; it could simply be called by the engine
automatically. That would also allow controlling the running order.

The problem I see with zend_execute_string() is the relationship with
opcode caches, because the filename you provide is not a real filename.
When an opcode cache receives such a name, it will analyze it as a valid
plain file name and it will try to stat() it for mtime, which will probably
fail... Actually, the opcode cache has no way to understand what's going on
here. That's why I proposed to use a stream-wrapper. It would imply a
registration mechanism to ensure path uniqueness but it would allow opcode
caching. And opcode caching is most important here, especially for code
executed systematically at RINIT time.

Opcache is why I think we should have a list of registered names. A simple
hash existence check and the cache will know what to do.
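
As a rough illustration of the "list of registered names" idea, the sketch below keeps a small table of builtin scripts that a cache could consult before touching the filesystem. Everything here is an assumption made for illustration: the names `builtin_script`, `builtin_script_register` and `builtin_script_find` are hypothetical, and a linear scan stands in for the hash lookup mentioned above.

```c
#include <stddef.h>
#include <string.h>

#define MAX_BUILTIN_SCRIPTS 32

/* One registered builtin script (hypothetical structure). */
typedef struct {
    const char *name;   /* registered virtual name */
    const char *source; /* PHP source embedded in the extension */
} builtin_script;

static builtin_script registry[MAX_BUILTIN_SCRIPTS];
static size_t registry_count = 0;

/* Returns the registered entry for `name`, or NULL if it is unknown. */
static const builtin_script *builtin_script_find(const char *name)
{
    size_t i;

    for (i = 0; i < registry_count; i++) {
        if (strcmp(registry[i].name, name) == 0) {
            return &registry[i];
        }
    }
    return NULL;
}

/* Registers a script; returns -1 if the table is full or the name is taken. */
static int builtin_script_register(const char *name, const char *source)
{
    if (registry_count == MAX_BUILTIN_SCRIPTS ||
        builtin_script_find(name) != NULL) {
        return -1;
    }
    registry[registry_count].name = name;
    registry[registry_count].source = source;
    registry_count++;
    return 0;
}
```

An extension would call the register function once at module startup; a cache that finds a name in the table knows the script is builtin and can skip any file-based validation.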

Cheers

François

10 years ago by francois@tekwire.net

From: Pierre Joye [mailto:pierre.php@gmail.com]

Opcache is why I think we should have a list of registered names. A simple hash existence check and the cache will know what to do.

Sorry, I am not sure I understand how the opcode cache, as it exists now, can understand this. Do you mean that opcode cache code would need to be modified ?

Anyway, this is a good occasion to ask: do we consider opcache the only supported opcode cache in the future, or do we still support APC and other alternative opcode caches? I'd like to know if we are free to improve the way the opcode cache communicates with the core, or if we must keep BC.

Regards

François

10 years ago by Pierre Joye

On Sat, Jan 10, 2015 at 9:12 AM, François Laupretre
francois@tekwire.net wrote:

From: Pierre Joye [mailto:pierre.php@gmail.com]

Opcache is why I think we should have a list of registered names. A simple hash existence check and the cache will know what to do.

Sorry, I am not sure I understand how the opcode cache, as it exists now, can understand this. Do you mean that opcode cache code would need to be modified ?

Yes and no. Yes, if we want them not to even try to update files
that are statically built into extensions. And yes, if we want that (for
whatever reasons I cannot think of right now).

Anyway, this is a good occasion to ask: do we consider opcache the only supported opcode cache in the future, or do we still support APC and other alternative opcode caches? I'd like to know if we are free to improve the way the opcode cache communicates with the core, or if we must keep BC.

That's PHP7, we can and have already broken BC internally.

Patch now in my fork (finally got a faster connection):

https://github.com/pierrejoye/php-src/compare/php:master...master

Date is used as the example.

There are two key parts:

. the configure script that creates the static data (it could actually be
used for any binary data btw)
. the zend_execute_string call, which could happen anywhere it is
possible, RINIT making the most sense right now
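
The first part, turning a file into static data, can be sketched in plain C. This is not the configure script from the patch, only an illustration of the same binary-to-C idea; the function name `emit_c_array` and the generated symbol names are hypothetical.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Writes `data` to `out` as a C array named `symbol`, plus a
 * `<symbol>_len` constant. A terminating 0x00 byte is appended so the
 * array can also be used as a C string; it is not counted in the length.
 * Returns 0 on success, -1 on a write error.
 */
static int emit_c_array(FILE *out, const char *symbol,
                        const unsigned char *data, size_t len)
{
    size_t i;

    if (fprintf(out, "static const unsigned char %s[] = {", symbol) < 0) {
        return -1;
    }
    for (i = 0; i < len; i++) {
        /* Start a new indented row every 12 bytes. */
        const char *sep = (i % 12 == 0) ? "\n    " : " ";
        if (fprintf(out, "%s0x%02x,", sep, data[i]) < 0) {
            return -1;
        }
    }
    if (fprintf(out, "%s0x00\n};\n", (len % 12 == 0) ? "\n    " : " ") < 0) {
        return -1;
    }
    if (fprintf(out, "static const size_t %s_len = %lu;\n",
                symbol, (unsigned long)len) < 0) {
        return -1;
    }
    return 0;
}
```

A build step would read the script from disk, pass its bytes to this function, and write the result into a header that the extension includes; the embedded bytes can then be handed to whatever call executes the script.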

As Sara pointed out, persistent opcodes would be way better, but as far
as I remember that is something very tricky to do (due to how the
addresses and offsets are stored). Maybe it has changed for 7; I did not
check that part. In any case, the exec_string and the binary-to-C
scripts will be helpful there too.

Cheers,

Pierre

@pierrejoye | http://www.libgd.org

10 years ago by francois@tekwire.net

From: Pierre Joye [mailto:pierre.php@gmail.com]

Sorry, I am not sure I understand how the opcode cache, as it exists now,
can understand this. Do you mean that opcode cache code would need to be
modified ?

Yes and no. Yes, if we want them not to even try to update files
that are statically built into extensions. And yes, if we want that (for
whatever reasons I cannot think of right now).

Sorry, it must be too late :)

AFAIR, the opcode cache attempts a stat() call on the filename it receives to get its mtime and uses it to determine whether the file was modified since it was cached. I don't understand how an opcode cache can do a stat() on the filename you provide as, without a stream wrapper, there is no way to access such information (which doesn't even make any sense here). Your filename will be recognized as a plain filename, the libc stat() call will look for this filename in the current directory, it won't be found, and opcode caching will fail. If the file doesn't have an associated stream-wrapped path, you cannot issue a stat() on it and I don't see how it can be cached. Where am I wrong?

That's PHP7, we can and have already broken BC internally.

So, you confirm that opcache is the only supported opcode cache in PHP 7 ? That's a good thing. I know that PHP7 breaks BC, I just wanted to know if developments in this branch had to remain compatible with APC and friends.

As Sara pointed out, persistent opcode would be way better but as far
as I remember it is something very tricky to do (due to how the
address and offset are stored), maybe it changed now for 7, I did not
check that part.

I remember we already discussed PHP userspace code persistence years ago. What I remember most is that it was a nightmare with tons of side effects, but we were including PHP data persistence in the scope, which is one more step towards hell. Maybe the core has evolved and things are easier now, especially in PHP7.

Anyway, from what I remember, Sara is the best one to design and implement an amazing solution in an amazingly short time (no flattery here :)

Regards

François

10 years ago by Pierre Joye

From: Pierre Joye [mailto:pierre.php@gmail.com]

Sorry, I am not sure I understand how the opcode cache, as it exists
now,
can understand this. Do you mean that opcode cache code would need to be
modified ?

Yes and no. Yes, if we want them not to even try to update files
that are statically built into extensions. And yes, if we want that (for
whatever reasons I cannot think of right now).

Sorry, it must be too late :)

AFAIR, the opcode cache attempts a stat() call on the filename it
receives to get its mtime and use it to determine if file was modified
since it was cached. I don't understand how an opcode cache can do a stat()
on the filename you provide as, without a stream wrapper, there is no way
to access such information (which doesn't even make any sense here). Your
filename will be recognized as a plain filename, the libc stat() call will
look for this filename in the current directory, it won't be found, and
opcode caching will fail. If the file doesn't have an associated
stream-wrapped path, you cannot issue a stat() on it and I don't see how it
can be cached. Where am I wrong ?

Say we do support builtin scripts: an opcache will simply load them on
minit or on the first request and flag them as permanent. Yes, it means we
need to change opcache, but that could be way easier than trying to hack the
engine to support persistent opcodes without ending up with ... an opcache.
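
The "flag them as permanent" idea can be sketched as a freshness check that never consults the filesystem for flagged entries. This is an assumption-laden illustration, not opcache code: `cache_entry`, its fields and `cache_entry_is_fresh` are hypothetical names.

```c
#include <sys/stat.h>
#include <time.h>

/* Hypothetical cache entry for one compiled script. */
typedef struct {
    const char *name;    /* file path or registered builtin name */
    time_t cached_mtime; /* mtime recorded when the entry was cached */
    int permanent;       /* non-zero for scripts built into an extension */
} cache_entry;

/*
 * Returns 1 if the cached entry may be reused, 0 if it must be rebuilt.
 * Permanent entries are always fresh: they cannot change without the
 * extension binary being rebuilt, so no stat() is attempted for them.
 */
static int cache_entry_is_fresh(const cache_entry *entry)
{
    struct stat st;

    if (entry->permanent) {
        return 1;
    }
    if (stat(entry->name, &st) != 0) {
        return 0; /* file vanished or is not a plain file */
    }
    return st.st_mtime == entry->cached_mtime;
}
```

With such a flag, a builtin name that does not exist on disk no longer causes a validation failure, while ordinary files keep the usual mtime comparison.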

That's PHP7, we can and have already broken BC internally.

So, you confirm that opcache is the only supported opcode cache in PHP 7
? That's a good thing. I know that PHP7 breaks BC, I just wanted to know if
developments in this branch had to remain compatible with APC and friends.

It is almost already the case for 5.5+; APC is dead for anything >= 5.5

10 years ago by francois@tekwire.net

From: Pierre Joye [mailto:pierre.php@gmail.com]

Say we do support builtin scripts: an opcache will simply load them on minit or on the first request and flag them as
permanent. Yes, it means we need to change opcache, but that could be way easier than trying to hack the engine to support
persistent opcodes without ending up with ... an opcache.

OK. Agree. I just didn't understand how it was possible without any change to the cache code.

Personally, if I implemented this, I would distribute a virtual file tree through a stream wrapper. The stream wrapper would be the same for all 'client' extensions. Emulating a file tree would allow PHP scripts to reference each other using a path similar to what we do in packages: (dirname(__FILE__).'/relative/path'). We just need the stream wrapper to emulate '.' and '..' and the whole range of relative paths is available. This way, the script executes in an environment that is as familiar as possible. To avoid conflicts, each extension would have a separate root dir, something like '<protocol>://<extension-name>/'.
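
Emulating '.' and '..' inside a per-extension root mostly comes down to path normalization. The sketch below shows one way to do it under stated assumptions: `vpath_normalize` is a hypothetical helper, it handles only the path part after '<protocol>://<extension-name>', and it rejects any path that would climb above the extension's root.

```c
#include <stddef.h>
#include <string.h>

/*
 * Normalizes `path` relative to an extension's virtual root and writes
 * the result (always starting with '/') into `out`.
 * Returns 0 on success, -1 if the path escapes the root or does not fit.
 */
static int vpath_normalize(const char *path, char *out, size_t outsz)
{
    size_t len = 0; /* current length of the string in out */
    const char *p = path;

    if (outsz < 2) {
        return -1;
    }
    out[0] = '\0';

    while (*p != '\0') {
        const char *seg;
        size_t seglen;

        while (*p == '/') {
            p++; /* skip separators */
        }
        seg = p;
        while (*p != '\0' && *p != '/') {
            p++;
        }
        seglen = (size_t)(p - seg);

        if (seglen == 0 || (seglen == 1 && seg[0] == '.')) {
            continue; /* empty segment or '.': nothing to do */
        }
        if (seglen == 2 && seg[0] == '.' && seg[1] == '.') {
            if (len == 0) {
                return -1; /* '..' at the root would escape it */
            }
            while (out[len - 1] != '/') {
                len--; /* drop the last segment's characters */
            }
            len--; /* drop the '/' that preceded it */
            out[len] = '\0';
            continue;
        }
        if (len + 1 + seglen + 1 > outsz) {
            return -1; /* no room for '/', the segment and the NUL */
        }
        out[len++] = '/';
        memcpy(out + len, seg, seglen);
        len += seglen;
        out[len] = '\0';
    }

    if (len == 0) {
        out[0] = '/';
        out[1] = '\0';
    }
    return 0;
}
```

For example, "/lib/../util/./helpers.php" normalizes to "/util/helpers.php", while "../x" is rejected, so one extension's scripts cannot reach into another extension's root.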

Off topic: I'd like to write a PHP7 RFC to extend the file system features whose behavior still differs between plain files and stream wrappers. Like most RFCs I write these days, this is an old subject, but it was refused for security reasons, as there was no way to distinguish a 'remote' stream wrapper from a 'local' one. Now, the is_url flag makes it possible to restrict dangerous features, like include_path or globbing, to 'local' wrappers, and some of these wrappers desperately need these features (primarily phar developers and me, actually :). If you find it crazy, please tell me before I write something more elaborate.

Thanks.

François

10 years ago by Pierre Joye

On Sat, Jan 10, 2015 at 11:07 PM, François Laupretre
francois@tekwire.net wrote:

From: Pierre Joye [mailto:pierre.php@gmail.com]

Say we do support builtin scripts: an opcache will simply load them on minit or on the first request and flag them as
permanent. Yes, it means we need to change opcache, but that could be way easier than trying to hack the engine to support
persistent opcodes without ending up with ... an opcache.

OK. Agree. I just didn't understand how it was possible without any change to the cache code.

Personally, if I implemented this, I would distribute a virtual file tree through a stream wrapper. The stream wrapper would be the same for all 'client' extensions. Emulating a file tree would allow PHP scripts to reference each other using a path similar to what we do in packages: (dirname(__FILE__).'/relative/path'). We just need the stream wrapper to emulate '.' and '..' and the whole range of relative paths is available. This way, the script executes in an environment that is as familiar as possible. To avoid conflicts, each extension would have a separate root dir, something like '<protocol>://<extension-name>/'.

From the point of view of a pure builtin script for an extension, I am
not in favour of allowing multiple files or adding streams. The impact
would be too big compared to in-memory scripts that are cached and
checked only once. Starting to redo phar for extension builtin scripts
sounds like an even bigger can of worms.

Off topic: I'd like to write a PHP7 RFC to extend the file system features whose behavior still differs between plain files and stream wrappers. Like most RFCs I write these days, this is an old subject, but it was refused for security reasons, as there was no way to distinguish a 'remote' stream wrapper from a 'local' one. Now, the is_url flag makes it possible to restrict dangerous features, like include_path or globbing, to 'local' wrappers, and some of these wrappers desperately need these features (primarily phar developers and me, actually :). If you find it crazy, please tell me before I write something more elaborate.

Good idea, and the stream APIs need a cleanup as well (see the other
discussions about that too). One key point btw: portability. Adding
non-portable features brings more pain than gain.

--
Pierre

@pierrejoye | http://www.libgd.org

10 years ago by Benjamin Eberlei

On Sun, Jan 4, 2015 at 3:52 AM, Benjamin Eberlei kontakt@beberlei.de
wrote:

Hey everyone,

I want to open discussion on my RFC to strengthen the ability of
extensions
to provide functionality to developers in both C and PHP code.

For this extensions can add PHP files to a list of "prepend files" that
are
part of every request execution exactly the same way the INI
auto_prepend_file functionality works:

https://wiki.php.net/rfc/extension_prepend_files

I propose implementation details in the RFC, but they are completely up
to
discussion. I am even sure there is probably a better way than what I
proposed, because I am not familiar with the code.

Just for the record here:

A proof of concept, IRC log, I am lazy :)

<Pierre> benjamin, Derick
https://gist.github.com/pierrejoye/ce4867a5eaabffa71df4
https://gist.github.com/pierrejoye/0859e3702ceb3bb652b6
https://gist.github.com/pierrejoye/544e60d8994094c55583
<Pierre> too slow internet for a fork & PR
<Pierre> but it works now. Add PHP_BUILTIN_SCRIPT(date,
PHP_EXT_DIR(date)/date.php) to config.m4, and call manually
zend_execute_script in RINIT, could be easier to do it in a register
function, inside MINIT and let the engine do it on RINIT, actually
cleaner, but this patch is only a proof of concept to play with

Cool thanks!

To recap the Twitter discussion: this requires opcache changes, and Sara
preferred having a persistent function table.

Both things are a big undertaking for this rather small feature, even
considering Sara's post about a JNI/PNI extension API.

Cheers,

Pierre

@pierrejoye | http://www.libgd.org

You still need to address the files somehow if you plan to have more than one, which would eventually lead you to the need to namespace it since people tend to name their files "utils.php". Which would be essentially the same as directories. So I'm not sure I understand the win here.

Again, there is a multitude of solutions for this, all in the realm of packaging. It's not like we've just encountered the idea of software package having more than one file.

Cheers,

-- Lester Caine - G8HFL
