Embedding PHP, few additional questions.

11 years ago by Ingwie Phoenix

Hey.

I am working on a nodejs module that allows embedding PHP (to use it within HTTP servers, for example), and I wish to extend it with a small thing. Instead of starting, running and shutting down the engine, I would like to keep it running and just reset its state. Why? I think that this would keep my OPCache. Sadly, I did not find any documentation on how OPCache saves its cache. This is just meant as a speed improvement.

How can I just reset the engine, without shutting it down?

The embedding instructions that I will use are here: http://phi.lv/?p=376

Kind regards, Ingwie

11 years ago by johannes@schlueters.de

Hi,

I am working on a nodejs module that allows embedding PHP (to use it
within HTTP servers, for example), and I wish to extend it with a small
thing. Instead of starting, running and shutting down the engine, I
would like to keep it running and just reset its state. Why? I
think that this would keep my OPCache. Sadly, I did not find any
documentation on how OPCache saves its cache. This is just meant as a
speed improvement.

How can I just reset the engine, without shutting it down?

I don't know what you mean by "reset". PHP is meant to serve requests, one
after another, where each request has its own context. You probably want
to run request cycles?

The embedding instructions that I will use are here:
http://phi.lv/?p=376

This seems to use the high-level embed API. Instead, use
php_request_startup(TSRMLS_C); ... php_request_shutdown((void *) 0);
style loops. See the embed API implementation, and a simple SAPI like
https://github.com/johannes/pconn-sapi/blob/master/pconnect-sapi.c#L180
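
To make the request-cycle idea concrete, here is a minimal, self-contained C sketch. It is a toy model under stated assumptions, not the PHP API: every name in it (engine_startup, request_startup, execute_script and so on) is hypothetical and only mirrors the shape of "start the engine once, then loop over request startup and shutdown". The point it illustrates is that engine-level state (here a stand-in for a compiled-script cache) survives across requests, while request-level state is thrown away each time.

```c
#include <string.h>

/* Toy model only: these names are hypothetical and are NOT the PHP API. */

/* Process-wide state: lives from engine startup to engine shutdown
 * (stands in for things like a compiled-script cache). */
struct engine {
    int running;
    int cached_scripts;   /* survives across requests */
};

/* Per-request state: created at request startup, discarded at shutdown
 * (stands in for variables, output buffers, and so on). */
struct request {
    int active;
    int variables_set;
};

static void engine_startup(struct engine *e)  { e->running = 1; e->cached_scripts = 0; }
static void engine_shutdown(struct engine *e) { e->running = 0; e->cached_scripts = 0; }

static void request_startup(struct request *r)  { r->active = 1; r->variables_set = 0; }
static void request_shutdown(struct request *r) { memset(r, 0, sizeof *r); }

/* "Run" a script: compile it only if the engine has not cached it yet.
 * Returns 1 if it had to compile, 0 on a cache hit. */
static int execute_script(struct engine *e, struct request *r)
{
    int compiled = 0;
    if (e->cached_scripts == 0) {
        e->cached_scripts = 1;
        compiled = 1;
    }
    r->variables_set += 3;  /* the script creates some request-local state */
    return compiled;
}

/* Serve n requests on one long-lived engine; returns how many compiles happened. */
static int serve_requests(struct engine *e, int n)
{
    int compiles = 0;
    for (int i = 0; i < n; i++) {
        struct request r;
        request_startup(&r);
        compiles += execute_script(e, &r);
        request_shutdown(&r);   /* request state is reset, engine state is kept */
    }
    return compiles;
}
```

In this model, serving five requests on one engine compiles the script only once, and the cache is gone only after engine_shutdown. That is the "reset without shutting down" behaviour asked about above, expressed as request cycles.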

johannes

11 years ago by Ingwie Phoenix

On 26.06.2014 at 12:14, Johannes Schlüter johannes@schlueters.de wrote:

Hi,

I am working on a nodejs module that allows embedding PHP (to use it
within HTTP servers, for example), and I wish to extend it with a small
thing. Instead of starting, running and shutting down the engine, I
would like to keep it running and just reset its state. Why? I
think that this would keep my OPCache. Sadly, I did not find any
documentation on how OPCache saves its cache. This is just meant as a
speed improvement.

How can I just reset the engine, without shutting it down?

I don't know what you mean by "reset". PHP is meant to serve requests, one
after another, where each request has its own context. You probably want
to run request cycles?

The embedding instructions that I will use are here:
http://phi.lv/?p=376

This seems to use the high-level embed API. Instead, use
php_request_startup(TSRMLS_C); ... php_request_shutdown((void *) 0);
style loops. See the embed API implementation, and a simple SAPI like
https://github.com/johannes/pconn-sapi/blob/master/pconnect-sapi.c#L180

johannes

Hey Johannes.

By „reset“ I meant resetting the interpreter and internal VM, so that my OPCache would be preserved but the used variables and such are unset. The script could then run again, but from the cached opcodes instead of from the script source. Is that possible? And thanks, I will look into that link!

Kind regards, Ingwie.

11 years ago by Ingwie Phoenix

Hey.

So I just looked into the pconn source and SAPI that you linked. It's basically easy, if it weren't for the rather confusing TSRM bits here and there. But I think I would just copy them…

I also did some research, and learned that OPCaches are stored in shared memory. Wouldn’t shared memory get lost when the interpreter is shut down, or how does this work? Is shared memory actually „fully associated“ with a program? Excuse the dumb question; this is my first direct encounter with shared memory.

In your pconn SAPI, you are telling PHP that you are not going to use headers… but, in my case, I have an incoming HTTP request and should very much forward those headers. Which function must I look at to copy my headers?

As this is going to be a nodejs module (v8php), I will link most of these functions into JS scope by supplying wrappers and the like. It is going to be pretty interesting to see this actually working, and to compare the speed to just using child_process.spawn.

Kind regards, Ingwie.

11 years ago by johannes@schlueters.de

I also did some research, and learned that OPCaches are stored in
shared memory. Wouldn’t shared memory get lost when the interpreter is
shut down, or how does this work? Is shared memory actually „fully
associated“ with a program? Excuse the dumb question; this is
my first direct encounter with shared memory.

In general, "shared memory" might mean many things. In the opcache case
you can assume that when PHP is shut down (not only the request), the
shared memory is freed.

In your pconn SAPI, you are telling PHP that you are not going to use
headers… but, in my case, I have an incoming HTTP request and should
very much forward those headers. Which function must I look
at to copy my headers?

php-src/sapi/ has quite a few examples, and php-src/main/SAPI.c and
related files show the implementation details. If you're referring to the
no_headers flag, this is about headers set from PHP, which would
otherwise be sent to the SAPI's header hooks ...

As this is going to be a nodejs module (v8php), I will link most of
these functions into JS scope by supplying wrappers and the like.
It is going to be pretty interesting to see this actually working, and
to compare the speed to just using child_process.spawn.

It will be worse. Without really looking into child_process.spawn, I
assume that it is non-blocking. I assume your PHP implementation will be
blocking and will therefore hold up everything else in nodejs. To make it
non-blocking you'd have to play a bit with threads, send PHP off to
worker threads, and ensure it's compiled in TSRM mode, which costs extra
time while executing. The better approach would be to use FastCGI / FPM
to communicate with PHP. Though I'd question the architecture of such
a system ...

johannes

11 years ago by Ingwie Phoenix

On 27.06.2014 at 11:18, Johannes Schlüter johannes@schlueters.de wrote:

I also did some research, and learned that OPCaches are stored in
shared memory. Wouldn’t shared memory get lost when the interpreter is
shut down, or how does this work? Is shared memory actually „fully
associated“ with a program? Excuse the dumb question; this is
my first direct encounter with shared memory.

In general, "shared memory" might mean many things. In the opcache case
you can assume that when PHP is shut down (not only the request), the
shared memory is freed.

Ohh. So the interpreter needs to stay „online“ in order to also keep the cache up. Thanks, that made things clearer for me. :)

In your pconn SAPI, you are telling PHP that you are not going to use
headers… but, in my case, I have an incoming HTTP request and should
very much forward those headers. Which function must I look
at to copy my headers?

php-src/sapi/ has quite a few examples, and php-src/main/SAPI.c and
related files show the implementation details. If you're referring to the
no_headers flag, this is about headers set from PHP, which would
otherwise be sent to the SAPI's header hooks …

So there are two methods of supplying headers? O.o Oha. Well, I will go and check the files out and see how they help me.

As this is going to be a nodejs module (v8php), I will link most of
these functions into JS scope by supplying wrappers and the like.
It is going to be pretty interesting to see this actually working, and
to compare the speed to just using child_process.spawn.

It will be worse. Without really looking into child_process.spawn, I
assume that it is non-blocking. I assume your PHP implementation will be
blocking and will therefore hold up everything else in nodejs. To make it
non-blocking you'd have to play a bit with threads, send PHP off to
worker threads, and ensure it's compiled in TSRM mode, which costs extra
time while executing. The better approach would be to use FastCGI / FPM
to communicate with PHP. Though I'd question the architecture of such
a system ...

johannes

Yes, child_process.spawn is non-blocking: it just returns an Event Emitter and the stdin/stdout/stderr stream resources. In fact, I want to model my PHP binding in the same fashion. For now, I first want to understand the API and create a good base for the module, and then extend it to be async/non-blocking. So I will be replacing the output and error output functions with respective callbacks. However, I am not planning to add any functionality to create userland functions from JS. I only want to be able to prepare a bunch of information (the file to be executed, the input/output methods, etc.), then give it to a function which will create a new thread (in libuv, uv_async_t, or something) and execute the given PHP file with the engine. The handle returned will contain the running PHP instance. Or at least, I need to find a way to run PHP and then keep it up, so that my OPCaches wouldn't get lost.

But this brings me to another question.

When I look into the pconnect SAPI, I can see that you start PHP… but I can see no real reference to a „PHP interpreter object“. In a third-party implementation of PHP, called PH7, one has to create a ph7_engine_t first, which corresponds to the actual engine instance.

If I want to keep PHP running, and even multiple instances, how would I do that? If I have no handle to differentiate between PHP instances, then I have a small problem… and I would actually have to fall back to using PHP-CGI/-FPM, which I wanted to avoid, as I was hoping for a speed improvement if everything happened in-process/per thread.

Kind regards, Ingwie.

11 years ago by johannes@schlueters.de

So there are two methods of supplying headers? O.o Oha. Well, I will go

There are request and response headers, and then there are different
situations the response headers might be set in ...

and check the files out and see how they help me.

Yes, looking at the code is useful to understand it.

Yes. child_process.spawn is non-blocking - it just returns an Event
Emitter and the stdin/stdout/stderr stream resources respectively. In
fact, I want to model my PHP binding in such fashion too. Currently, I
am firstly wanting to understand the API and create a good base for
the module. Then extend it to be async/non-blocking. So, I will be
replacing output and error output functions with respective callbacks.
However, I am not planning to add any functionality to create userland
functions from JS. I only want to be able to prepare a bunch of
information - the file to be executed, the in-/output methods, etc -
then give it to a function, which will create a new thread (in libuv,
uv_async_t, or something) and execute the given PHP file with the
engine. The handle returned will contain the running PHP instance - or
at least, I need to find a way to run PHP and then keep it up, so that
my OPCaches wouldn't get lost.

Forget Opcache. Get to know PHP's request-based model, which
goes through PHP's complete architecture. Opcache comes in at the
end, (more or less) easily.

But this brings me to another question.

When I look into pconnect SAPI, I can see that you start PHP…but
actually, I can see no real reference to a „PHP interpreter object“.
In a third-party implementation of PHP, called PH7, one has to create
a ph7_engine_t first - which corresponds to the actual engine
instance.

If I want to keep PHP running - and even multiple instances - how
would I do that? If I have no handle to differentiate between PHP
instances, then I have a small problem…and I would have to actually
fall back to using PHP-CGI/-FPM, which I wanted to avoid, as I was
hoping for a speed improvement if everything happened in-process/per
thread.

As said before, the request context is defined by the request startup and
shutdown functions. This works by using globals to keep the state. In
threaded environments, to run multiple requests in parallel, we have TSRM
as a thread isolation layer binding a request context to a thread. Thus
the request is bound to a thread. Once it's finished, the thread can
handle the next request.
My pconn SAPI supports some form of parallelism, too. Check the main.c
file.
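
As an illustration of "binding a request context to a thread", here is a small self-contained C sketch. It is a toy model, not PHP's actual TSRM code: the names (request_globals, context_for_thread, and so on) are hypothetical, and no real threads are started. It only shows the idea that what would be plain globals in a single-threaded build becomes one slot per thread, and that the slot lookup plays the role of the missing "interpreter handle".

```c
#include <stddef.h>

/* Toy model only: hypothetical names, not PHP's actual TSRM code. */

#define MAX_THREADS 4

/* What a non-thread-safe build would keep in plain process globals. */
struct request_globals {
    int in_request;
    int statements_run;
};

/* One slot per worker thread; a thread only ever touches its own slot. */
static struct request_globals slots[MAX_THREADS];

/* Look up the context that belongs to a given thread id.
 * Returns NULL for an id outside the table. */
static struct request_globals *context_for_thread(size_t thread_id)
{
    return thread_id < MAX_THREADS ? &slots[thread_id] : NULL;
}

/* Begin a request in the given context: mark it active and reset its state. */
static void request_startup(struct request_globals *ctx)
{
    ctx->in_request = 1;
    ctx->statements_run = 0;
}

/* Stand-in for executing code: only the given context is modified. */
static void run_statements(struct request_globals *ctx, int n)
{
    ctx->statements_run += n;
}

/* End the request; the slot is free to serve the next request. */
static void request_shutdown(struct request_globals *ctx)
{
    ctx->in_request = 0;
}
```

Two requests running against slots 0 and 1 do not see each other's counters, and shutting one down leaves the other active. In a real threaded build each worker thread would look up its own slot instead of being handed an id.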

But for your project this becomes really messy: a "request" comes in,
you start a worker thread and fill it with the data. The output hooks then
have to be written in a way that captures the output and sends it back to
the v8 thread, which will, once it has time, handle it and report back
to the JavaScript code. And if you want to make it efficient, you can't
create a fresh thread each time but instead want a worker pool
of threads and some scheduler putting tasks in.
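
The "output hook that captures output" part can be sketched in isolation. The following is a minimal C example under stated assumptions: the hook type and all function names are hypothetical and only loosely modelled on the idea of a SAPI write callback, and the threading and hand-over to the v8 thread are deliberately left out. It shows a callback that appends every chunk to a growable buffer instead of writing it to stdout.

```c
#include <stdlib.h>
#include <string.h>

/* Toy model only: hypothetical names, not a real SAPI interface. */

struct output_buffer {
    char  *data;
    size_t len;
    size_t cap;
};

/* Signature of the write hook the "engine" calls for every chunk of output. */
typedef size_t (*write_hook)(const char *chunk, size_t len, void *user_data);

/* Hook implementation: append the chunk to a growable buffer instead of
 * printing it. Returns the number of bytes stored (0 on allocation failure). */
static size_t capture_write(const char *chunk, size_t len, void *user_data)
{
    struct output_buffer *buf = user_data;
    if (buf->len + len + 1 > buf->cap) {
        size_t new_cap = (buf->len + len + 1) * 2;
        char *grown = realloc(buf->data, new_cap);
        if (grown == NULL)
            return 0;
        buf->data = grown;
        buf->cap = new_cap;
    }
    memcpy(buf->data + buf->len, chunk, len);
    buf->len += len;
    buf->data[buf->len] = '\0';   /* keep the buffer usable as a C string */
    return len;
}

/* Stand-in for script execution: emits output only through the hook. */
static void run_fake_script(write_hook write, void *user_data)
{
    write("Hello, ", 7, user_data);
    write("world", 5, user_data);
}
```

A caller would zero-initialise an output_buffer, pass capture_write and the buffer to run_fake_script, and afterwards own the collected bytes (and free them). In the design described above, that collected buffer is what a worker thread would hand back to the main thread.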

Besides that complexity, you have to mind other issues in this
setup. For instance, TSRM mode is slower than non-TSRM mode. And as
everything is one process, a crashing PHP (e.g. a stack overflow due to
infinite recursion) will kill all parallel PHP requests and the whole node
instance.

So really, if you plan on operating such a thing, write a FastCGI or FPM
module calling PHP in such a way. Or even better: make your PHP code a
web service (REST etc.) so both sides can be scaled individually.

If you want to do this for academic purposes or to prove me wrong, please
read the code and try to understand it. Experiment with it in small
pieces etc.

johannes

11 years ago by Ingwie Phoenix

On 27.06.2014 at 16:20, Johannes Schlüter johannes@schlueters.de wrote:

So there are two methods of supplying headers? O.o Oha. Well, I will go

There are request and response headers, and then there are different
situations the response headers might be set in …

Whoah, so many options... o-o

and check the files out and see how they help me.

Yes, looking at the code is useful to understand it.

Yes. child_process.spawn is non-blocking - it just returns an Event
Emitter and the stdin/stdout/stderr stream resources respectively. In
fact, I want to model my PHP binding in such fashion too. Currently, I
am firstly wanting to understand the API and create a good base for
the module. Then extend it to be async/non-blocking. So, I will be
replacing output and error output functions with respective callbacks.
However, I am not planning to add any functionality to create userland
functions from JS. I only want to be able to prepare a bunch of
information - the file to be executed, the in-/output methods, etc -
then give it to a function, which will create a new thread (in libuv,
uv_async_t, or something) and execute the given PHP file with the
engine. The handle returned will contain the running PHP instance - or
at least, I need to find a way to run PHP and then keep it up, so that
my OPCaches wouldn't get lost.

Forget Opcache. Get to know PHP's request-based model, which
goes through PHP's complete architecture. Opcache comes in at the
end, (more or less) easily.

What do you mean by that? It might be my English, but I did not really understand what you meant.

But this brings me to another question.

When I look into pconnect SAPI, I can see that you start PHP…but
actually, I can see no real reference to a „PHP interpreter object“.
In a third-party implementation of PHP, called PH7, one has to create
a ph7_engine_t first - which corresponds to the actual engine
instance.

If I want to keep PHP running - and even multiple instances - how
would I do that? If I have no handle to differentiate between PHP
instances, then I have a small problem…and I would have to actually
fall back to using PHP-CGI/-FPM, which I wanted to avoid, as I was
hoping for a speed improvement if everything happened in-process/per
thread.

As said before, the request context is defined by the request startup and
shutdown functions. This works by using globals to keep the state. In
threaded environments, to run multiple requests in parallel, we have TSRM
as a thread isolation layer binding a request context to a thread. Thus
the request is bound to a thread. Once it's finished, the thread can
handle the next request.
My pconn SAPI supports some form of parallelism, too. Check the main.c
file.

Actually, I am indeed scrolling through pconn. It currently has the cleanest API showcase that I can see, to be very honest. But now you have finally helped me understand something that I always wondered about: the TSRM_xx macros! Now it makes sense why they are everywhere, acting as a sort of mutex… or rather, a thing to differentiate between threads. I always got confused about how this is actually used. Well, now I know. A lesson a day keeps the dumbness away, methinks! ^.^

But for your project this becomes really messy: a "request" comes in,
you start a worker thread and fill it with the data. The output hooks then
have to be written in a way that captures the output and sends it back to
the v8 thread, which will, once it has time, handle it and report back
to the JavaScript code. And if you want to make it efficient, you can't
create a fresh thread each time but instead want a worker pool
of threads and some scheduler putting tasks in.

Besides that complexity, you have to mind other issues in this
setup. For instance, TSRM mode is slower than non-TSRM mode. And as
everything is one process, a crashing PHP (e.g. a stack overflow due to
infinite recursion) will kill all parallel PHP requests and the whole node
instance.

Oh s… That I did not know/consider. So if, for some reason, the PHP instance crashes or dies, all other instances, and possibly the rest of the application, die too? Wow. Guess I should create a watchdog that keeps track of the service.

So really, if you plan on operating such a thing, write a FastCGI or FPM
module calling PHP in such a way. Or even better: make your PHP code a
web service (REST etc.) so both sides can be scaled individually.

The goal is to run websockets, a CDN and a DB cache together with HTTP(S). So turning it into a REST-based service would actually make the structure more complex. Actually, I can show you the concept documentation. It is still in progress, but it lists quite a lot of the concepts, ideas and things I will/want to do. Maybe it helps explain why I am even doing this. [1]

If you want to do this for academic purposes or to prove me wrong, please
read the code and try to understand it. Experiment with it in small
pieces etc.

johannes

I am now reading all the code I can that is related to what I am doing. I just noticed that the PHP on OS X has no maintainer-zts. So, meanwhile, I am compiling 5.5.14 with embed and maintainer-zts…

Thanks for all your help, it really brings me forward a lot!

Kind regards, Ingwie.

[2]: https://github.com/IngwiePhoenix/v8php The repo where I will put my stuff…very rough layout, not usable. Just there to be there.

11 years ago by johannes@schlueters.de

Forget Opcache. Get to know PHP's request-based model, which
goes through PHP's complete architecture. Opcache comes in at the
end, (more or less) easily.

What do you mean by that? It might be my English, but I did not really
understand what you meant.

You are asking "How can I use opcache in some environment, which I
actually don't know" while the actual question should be "What's PHP's
internal architecture especially in regards to SAPIs". Opcache fits in,
in the end, once the architecture is understood.

Actually, I am indeed scrolling through pconn. It currently has the
cleanest API showcase that I can see, to be very honest.

It does a trivial thing and has neither user interaction nor external
systems to interact with. That's why I like to show it as an example for
these questions, even though it's not perfect. But those are details.

But now you have finally helped me understand something that I
always wondered about: the TSRM_xx macros! Now it makes sense why they are
everywhere, acting as a sort of mutex… or rather, a thing to
differentiate between threads. I always got confused about how this
is actually used. Well, now I know. A lesson a day keeps the
dumbness away, methinks! ^.^

"Thread Safe Resource Manager" ... again, try to dive into the
architecture of PHP.

Oh s… That I did not know/consider. So if, for some reason, the PHP
instance crashes or dies, all other instances, and possibly the rest
of the application, die too? Wow. Guess I should create a watchdog
that keeps track of the service.

Well, each service should be watched. On good operating systems (note:
Solaris ad coming up) you can define contracts for processes
the kernel should watch; as soon as they misbehave, a service manager will
be notified and can take action. But well, that's a generic issue;
nodejs/v8 can also crash in some situations, etc.
For a system design the question is: how can you limit the risk and
impact? With your design, everything is lost and has to be restarted.
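
To illustrate what "limiting the impact" can look like, here is a small self-contained C sketch for POSIX systems. It is only a sketch under stated assumptions: the helper names (run_isolated, crashing_job, well_behaved_job) are hypothetical, and a real setup would use FastCGI/FPM or a service manager rather than a bare fork. It shows that when the risky work runs in a separate process, a crash is reported to the parent as a signal instead of taking the parent down.

```c
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Toy model only: hypothetical helper names; POSIX systems only. */

/* Run job() in a child process.
 * Returns the exit code (>= 0) if it exited normally,
 * -signal if it was killed by a signal, or -1000 if fork/waitpid failed. */
static int run_isolated(void (*job)(void))
{
    pid_t pid = fork();
    if (pid < 0)
        return -1000;
    if (pid == 0) {          /* child: do the work, then leave immediately */
        job();
        _exit(0);
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1000;
    if (WIFEXITED(status))
        return WEXITSTATUS(status);
    if (WIFSIGNALED(status))
        return -WTERMSIG(status);
    return -1000;
}

static void well_behaved_job(void) { /* nothing to do */ }
static void crashing_job(void)     { abort(); }  /* stands in for a fatal crash */
```

Calling run_isolated(crashing_job) returns -SIGABRT to the caller, which keeps running and can start the next job. With everything in one process, the same abort() would end the whole program.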

The goal is to run websockets, a CDN and a DB cache together with
HTTP(S). So turning it into a REST-based service would actually make the
structure more complex.

You seem to have a different definition of "complex" than me. I consider
a monolithic block that mixes multiple language runtimes, which follow
completely different paradigms ("stay away from threads and use events
and be lightweight" vs. "have a shared-nothing container handling one
request after another") and which are both quite sophisticated, to be
quite complex, whereas I consider an architecture of smaller independent
components, which I can maintain, scale, ... independently, to be cleaner.

johannes

11 years ago by Ingwie Phoenix

On 27.06.2014 at 17:29, Johannes Schlüter johannes@schlueters.de wrote:

Forget Opcache. Get to know PHP's request-based model, which
goes through PHP's complete architecture. Opcache comes in at the
end, (more or less) easily.

What do you mean by that? It might be my English, but I did not really
understand what you meant.

You are asking "How can I use opcache in some environment, which I
actually don't know" while the actual question should be "What's PHP's
internal architecture especially in regards to SAPIs". Opcache fits in,
in the end, once the architecture is understood.

Okay. I will be looking into the PHP source and documentation and try to understand it further. It looks like what I have learned so far is just 1% of what I really need to know. Kinda scary, but also a challenge I wish to take on.

But now you have finally helped me understand something that I
always wondered about: the TSRM_xx macros! Now it makes sense why they are
everywhere, acting as a sort of mutex… or rather, a thing to
differentiate between threads. I always got confused about how this
is actually used. Well, now I know. A lesson a day keeps the
dumbness away, methinks! ^.^

"Thread Safe Resource Manager" ... again, try to dive into the
architecture of PHP.

Well, I tried to understand all the odd TSRM macros before, but I couldn't find a real answer - just that I had to put them into certain places, but never why. Now it just clicks and makes sense. Sadly, some parts of PHP are not documented - or the documentation is very outdated… (E.g. I found an embedding guide for PHP 4 before finally finding one for 5.)

Oh s… That I did not know/consider. So if, for some reason, the PHP
instance crashes or dies, all other instances, and possibly the rest
of the application, die too? Wow. Guess I should create a watchdog
that keeps track of the service.

Well, each service should be watched. On good operating systems (note:
Solaris ad coming up) you can define contracts for processes
the kernel should watch; as soon as they misbehave, a service manager will
be notified and can take action. But well, that's a generic issue;
nodejs/v8 can also crash in some situations, etc.
For a system design the question is: how can you limit the risk and
impact? With your design, everything is lost and has to be restarted.

Now that you have made me aware of this thing that I did not think about, I will really have to make a new design. I have not used Solaris, but I have heard many good things. Currently I have stuck to Debian - but this feature sounds like one reason to consider switching distros. I am getting a new server anyway, which is a good opportunity to learn new things. o.o

The goal is to run websockets, a CDN and a DB cache together with
HTTP(S). So turning it into a REST-based service would actually make the
structure more complex.

You seem to have a different definition of "complex" than me. I consider
a monolithic block that mixes multiple language runtimes, which follow
completely different paradigms ("stay away from threads and use events
and be lightweight" vs. "have a shared-nothing container handling one
request after another") and which are both quite sophisticated, to be
quite complex, whereas I consider an architecture of smaller independent
components, which I can maintain, scale, ... independently, to be cleaner.

johannes

It’s definitely not the first time somebody has said that… haha. I know that PHP and NodeJS have by far different ideas of how things should be handled. But currently available web servers don't support WebSockets too well, so my last resort was to use nodejs directly. And I am often in environments where ports other than 80 are blocked (my school, for example), so I was looking for a way to run it all through one port, with many things behind the scenes. In the end, that got me to the point where I needed to run PHP underneath nodejs. It's not truly a „language-to-language“ binding: it just gives PHP something to process and returns the output. The two are completely isolated, except for the fact that they are stuck in the same memory/program space, and don't really interact with each other. Basically, I just want to write a SAPI under nodejs, similar to Apache's. And that is why I am currently sitting here with mod_php5.c open.

Kind regards, Ingwie.