Hello PHP folks,
I've seen this discussed previously, and would like to add a few
arguments for the multi-threading side vs. async processing,
which I've seen mentioned as a viable alternative:
1. Imagine that from time to time, some background processing takes 1
second of CPU time - w/o multithreading, all your async operations,
like accepting a connection to a socket, aio or others are basically
stalled. So, async is a good approach, but does not work as a magic
wand for all problem spaces. Alternatively, you could fork and then do the
processing, but then the state syncing of the forked background processing
results with the main thread requires a whole new protocol / switching to
interprocess communication, which makes such developments unnecessarily
hard. Threads exist for a _reason_ not only because they sound cool.
2. Without thread support, it is not possible to use multi-core processing
paradigms directly from PHP - which means PHP relies on external frameworks for
that feature, which, in a sense, makes it a non-general-purpose language.
It _could become_ a general purpose tool, if it had proper multi-threading
support built-in.
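To make point 1 concrete, here is a minimal sketch (not taken from any real server) of a single-threaded loop whose ticks are meant to fire every 10 ms. The names run_loop() and burn_cpu() and all timings are made up for illustration; the busy loop stands in for the "1 second of CPU time", shortened here to 200 ms.

```php
<?php
// Burn CPU for roughly $seconds of wall-clock time (stand-in for real work).
function burn_cpu(float $seconds): void {
    $end = microtime(true) + $seconds;
    while (microtime(true) < $end) {
        // busy loop, nothing else can run on this thread meanwhile
    }
}

// Toy single-threaded loop: tick $i is due at $start + $i * $interval.
// Returns the worst observed lateness (in seconds) of any tick.
function run_loop(int $ticks, float $interval, int $heavyTick, float $heavyCost): float {
    $start = microtime(true);
    $worst = 0.0;
    for ($i = 0; $i < $ticks; $i++) {
        $due = $start + $i * $interval;
        $now = microtime(true);
        if ($now < $due) {
            usleep((int) (($due - $now) * 1e6)); // idle until the tick is due
        }
        $worst = max($worst, microtime(true) - $due);
        if ($i === $heavyTick) {
            burn_cpu($heavyCost); // the occasional CPU-heavy job
        }
    }
    return $worst;
}

$worst = run_loop(10, 0.01, 3, 0.2);
printf("worst tick lateness: %.0f ms\n", $worst * 1000);
```

On a single thread, every tick that falls due during the burn is served late, no matter how the I/O itself is multiplexed.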
I, personally, considered developing websockets / nanoserv server stack with PHP
and bumped into the multithreading limitation - AFAIK it is the only big feature
separating PHP from the general purpose languages. Everything else is well
integrated with lots of external libraries/modules giving rise to potential rapid
application development while using it.
Cheers and let me know about your thoughts, and potential core implementation
issues regarding developing this language feature.
--
Best regards,
speedy mailto:speedy.spam@gmail.com
1. Imagine that from time to time, some background processing takes 1 second of CPU time - w/o multithreading, all your async operations, like accepting a connection to a socket, aio or others are basically stalled. So, async is a good approach, but does not work as a magic wand for all problem spaces. Alternatively, you could fork and then do the processing, but then the state syncing of the forked background processing results with the main thread requires a whole new protocol / switching to interprocess communication, which makes such developments unnecessarily hard. Threads exist for a _reason_ not only because they sound cool.
There are plenty of mechanisms/protocols for doing this. Gearman works
extremely well for managing out-of-band jobs like this, for example.
2. Without thread support, it is not possible to use multi-core processing paradigms directly from PHP - which means PHP relies on external frameworks for that feature, which, in a sense, makes it a non-general-purpose language. It _could become_ a general purpose tool, if it had proper multi-threading support built-in.
PHP is not a general-purpose language and there are no plans to make it
one. Your OS provides scheduling and is responsible for making best use
of your multiple cores. With many concurrent web requests your multiple
cores should be put to good use.
-Rasmus
Hello Rasmus,
Thursday, April 1, 2010, 5:21:55 PM, you wrote:
1. Imagine that from time to time, some background processing takes 1 second of CPU time - w/o multithreading, all your async operations, like accepting a connection to a socket, aio or others are basically stalled. So, async is a good approach, but does not work as a magic wand for all problem spaces. Alternatively, you could fork and then do the processing, but then the state syncing of the forked background processing results with the main thread requires a whole new protocol / switching to interprocess communication, which makes such developments unnecessarily hard. Threads exist for a _reason_ not only because they sound cool.
There are plenty of mechanisms/protocols for doing this. Gearman works
extremely well for managing out-of-band jobs like this, for example.
Yes, but sometimes, esp. in more real-time oriented applications, state merging is
very hard to do - though I agree that protocols and mechanisms exist for out of
band processing, there are plenty of cases in which they are suboptimal.
2. Without thread support, it is not possible to use multi-core processing paradigms directly from PHP - which means PHP relies on external frameworks for that feature, which, in a sense, makes it a non-general-purpose language. It _could become_ a general purpose tool, if it had proper multi-threading support built-in.
PHP is not a general-purpose language and there are no plans to make it
one. Your OS provides scheduling and is responsible for making best use
of your multiple cores. With many concurrent web requests your multiple
cores should be put to good use.
Current situational facts are against that claim - with PHP delivered as a CLI
interpreter, and via that route, having more and more software applications written
outside the web serving scope/domain, PHP has spontaneously made strides into
becoming a general purpose tool.
I think the only remaining step in that direction is to have native
multi-threading support. I plan to check how hard that would be, and see how
many people would be needed to develop that feature in some reasonable time-frame.
Also, keep in mind that the web is slowly shifting towards real-time communication
/ streaming with emergence of Comet, HTML5 Web Sockets etc. There are already many web
server implementations specialising in that, and PHP is not their language of choice.
-Rasmus
--
Best regards,
speedy mailto:speedy.spam@gmail.com
Also, keep in mind that the web is slowly shifting towards real-time communication
/ streaming with emergence of Comet, HTML5 Web Sockets etc. There are already many web
server implementations specialising in that, and PHP is not their language of choice.
The large comet implementations I know about use a high performance
proxy to hold hundreds of thousands of open sockets and then they have a
dispatching mechanism to route individual requests to the various
backend processing engines. PHP is very much part of that picture, and
again, given that the concurrency is invariably going to be high,
multi-core usage is handled by your OS.
In any sort of Web architecture native threading in PHP just doesn't
make any sense. In single monolithic CLI apps, you could make a case
for it, but that is not the sort of architecture we are going to put a
significant amount of time into.
-Rasmus
Hello Rasmus,
Thursday, April 1, 2010, 6:16:21 PM, you wrote:
In any sort of Web architecture native threading in PHP just doesn't
make any sense.
Imagine a real-time websockets/HTTP based server processing architecture with
quick event passing from one connection to another with possibility of each web
client broadcasting events to all other web clients. A simple example: web chat,
and a complex one: real-time web MMO game.
Now imagine a whole web server written in PHP (ie. nanoserv), say, using
libevent as the network backend, running the above described real-time web
implementation. Alternatively, you could perhaps even wire it into worker/event
model of apache/other servers instead of rolling your own. It sounds quite powerful,
and development-effort-wise cheap - out of a mere HTML preprocessor!
With proper threading, this would be a piece of cake to implement in PHP, efficiently
and ensuring low latency, using up all available CPU cores / resources.
Without using native threads, the whole class of web architectures on both server
and processing levels, viewed both separately or together, are quite a bit more hairy
to implement.
In single monolithic CLI apps, you could make a case
for it, but that is not the sort of architecture we are going to put a
significant amount of time into.
Yep, agreed and understood.
-Rasmus
--
Best regards,
speedy mailto:speedy.spam@gmail.com
Now imagine a whole web server written in PHP (ie. nanoserv), say, using
libevent as the network backend, running the above described real-time web
implementation. Alternatively, you could perhaps even wire it into worker/event
model of apache/other servers instead of rolling your own. It sounds quite powerful,
and development-effort-wise cheap - out of a mere HTML preprocessor!
With proper threading, this would be a piece of cake to implement in PHP, efficiently
and ensuring low latency, using up all available CPU cores / resources.
Without using native threads, the whole class of web architectures on both server
and processing levels, viewed both separately or together, are quite a bit more hairy
to implement.
Sorry, but as a performance-oriented C developer who has written a good
amount of stuff directly against libevent, I'm having a hard time not
laughing at this thread. Why would anyone want to write such a thing
directly in PHP, or any other scripting language for that matter? If you're
worried about performance, you're going to have to do it in something either
low-level or that is natively compiled. In the scripting language world, I
can only think of something like Lua with libevent bindings as being capable
of performing fairly well in a multi-threaded environment (given no GIL).
Though, even in that case, you'd still have a good amount of custom work to
do.
I've worked with libevent-based Comet servers written in C, and there's no
way a scripting language version is going to handle memory management,
messaging, and queueing for a large number of connections effectively.
Similarly, you'll never really be able to approach near-real-time processing
in this type of environment given basic
abstraction/interpretation/scheduling issues.
These are very basic systems architecture issues and it's obvious that
scripting languages are not designed for what you're asking. Not to sound
demeaning, but do you have any real need for what you're asking, or are you
just saying, "gee, it would be nice for me to be able to write a Comet
server in 99 lines of PHP that is capable of managing 60K concurrent
connections and a rate of 15K messages/second with a maximum latency of
50ms?"
Sorry, I'm a little grumpy today...
--
Jonah H. Harris
Blog: http://www.oracle-internals.com/
Now imagine a whole web server written in PHP (ie. nanoserv), say, using
libevent as the network backend, running the above described real-time web
implementation. Alternatively, you could perhaps even wire it into worker/event
model of apache/other servers instead of rolling your own. It sounds quite powerful,
and development-effort-wise cheap - out of a mere HTML preprocessor!
PHP works fine in Apache worker now if you know what you are doing and
don't do stupid things with PHP in a web environment. Keep all the
stupid stuff in Gearman, cron or non-threaded servers.
This is just not what PHP is for.
Brian.
Hi!
processing, but then the state syncing of the forked background processing results with the main thread requires a whole new protocol / switching to interprocess communication, which makes such developments unnecessarily hard. Threads exist for a _reason_ not only because they sound cool.
The interesting thing here is that threads would require (at least if they
are implemented the way C, etc. threads usually work) a whole new protocol
of synchronization too. Just think about it: do you have shared
classes/variables/etc.?
If you do, how do you control access to them? Hello locks and the whole can
of worms! Most people that think they can program in threads actually
are just pushing their luck until some complex interaction leaves their
app malfunctioning in a bizarre way and them without any means to debug
it. I mean, you can do it right, but it's usually harder than it looks.
Share-nothing exists for a reason too :)
If you don't, how is it different from forking, except that you need to
work 10 times as hard to keep engine guts from entangling in two "threads"?
2. Without thread support, it is not possible to use multi-core processing paradigms directly from PHP - which means PHP relies on external frameworks for
What kinds of paradigms are those? In 90% of the cases I have seen, if you
need real multicore involvement in PHP you are either doing it in the wrong
place (i.e. webserver vs. backend) or doing it in the wrong language.
Stanislav Malyshev, Zend Software Architect
stas@zend.com http://www.zend.com/
(408)253-8829 MSN: stas@zend.com
If you do, how you control access to them? Hello locks and the whole can
of worms! Most people that think they can program in threads actually
are just pushing their luck until some complex interaction leaves their
app malfunctioning in a bizarre way and them without any means to debug
it.
Well, I don't agree with the statement above, but I do believe that if you're competent
enough to use threads, then you're competent enough to write in C/C++.
Not to mention that a high-performance multi-threaded daemon written in PHP
is a total nonsense - that's exactly the task C/C++ do much better/faster.
--
Wbr,
Antony Dovgal
http://pinba.org - realtime statistics for PHP
Are we talking about having a thread-safe PHP or parallel-like
features available in userland?
The former needs some love to be actually true, while most of the
issues come from external libs. The latter makes little or no sense.
If you do, how you control access to them? Hello locks and the whole can
of worms! Most people that think they can program in threads actually
are just pushing their luck until some complex interaction leaves their
app malfunctioning in a bizarre way and them without any means to debug
it.
Well, I don't agree with the statement above, but I do believe that if you're competent
enough to use threads, then you're competent enough to write in C/C++.
Not to mention that a high-performance multi-threaded daemon written in PHP
is a total nonsense - that's exactly the task C/C++ do much better/faster.
--
Wbr,
Antony Dovgal
http://pinba.org - realtime statistics for PHP
--
--
Pierre
@pierrejoye | http://blog.thepimp.net | http://www.libgd.org
Are we talking about having a thread-safe PHP or parallel-like
features available in userland?
The former needs some love to be actually true, while most of the
issues come from external libs. The latter makes little or no sense.
We're talking about threads in userspace, of course.
--
Wbr,
Antony Dovgal
http://pinba.org - realtime statistics for PHP
Pierre Joye wrote:
Are we talking about having a thread-safe PHP or parallel-like
features available in userland?
The former needs some love to be actually true, while most of the
issues come from external libs. The latter makes little or no sense.
Something we agree on Pierre.
The first piece of the jigsaw IS to make everything thread safe.
And perhaps link to internally threaded functions such as asynchronous queries.
Then a discussion about user land multi-threading may make more sense, but I
think Rasmus has stated the case quite succinctly as to why that does not make
sense ;)
--
Lester Caine - G8HFL
Contact - http://lsces.co.uk/wiki/?page=contact
L.S.Caine Electronic Services - http://lsces.co.uk
EnquirySolve - http://enquirysolve.com/
Model Engineers Digital Workshop - http://medw.co.uk//
Firebird - http://www.firebirdsql.org/index.php
Hello Antony,
Thursday, April 1, 2010, 8:52:13 PM, you wrote:
Not to mention that a high-performance multi-threaded daemon written in PHP
is a total nonsense - that's exactly the task C/C++ do much better/faster.
I'd like to have easily written, decently performing multi-threaded
daemon.
And the things in between here and there are quite hard to do, but still doable:
- native multithreading (my uneducated guess: copy on write / rw-sems on the
internals and mutex objects on the language level)
- PHP JIT (via LLVM or similar)
I do agree with the parent poster that it is a can of worms which can
easily shoot you in the foot ;) Some kind of php.ini or #pragma-alike
threading directive for the programmer to "unlock" it him/herself
would be really smart to be had.
--
Wbr,
Antony Dovgal
http://pinba.org - realtime statistics for PHP
--
Best regards,
speedy mailto:speedy.spam@gmail.com
Hi!
processing, but then the state syncing of the forked background processing results with the main thread requires a whole new protocol / switching to interprocess communication, which makes such developments unnecessarily hard. Threads exist for a _reason_ not only because they sound cool.
Interesting thing here threads would require (at least if they are implemented the way C, etc. threads usually work) whole new protocol of synchronization too. Just think about something: do you have shared classes/variables/etc.?
If you do, how you control access to them? Hello locks and the whole can of worms! Most people that think they can program in threads actually are just pushing their luck until some complex interaction leaves their app malfunctioning in a bizarre way and them without any means to debug it. I mean, you can do it right, but it's usually harder than it looks. Share-nothing exists for a reason too :)
Well, strictly speaking, there are [at least] 2 models which can be used:
- Classical with shared resources and locking
- STM
Anyway, do we really have to tell people "you don't need it" when they believe that they do?
Python has multithreading and it works reasonably well. People who know what they are doing can implement really brilliant solutions (think Tornado).
And something like GIL feels like a reasonable compromise to me.
Anyway, do we really have to tell people "you don't need it" when they
believe that they do?
Python has multithreading and it works reasonably well. People who know
what they are doing can implement really brilliant solutions (think Tornado).
Interesting thing here threads would require (at least if they are
implemented the way C, etc. threads usually work) whole new protocol of
synchronization too. Just think about something: do you have shared
classes/variables/etc.?
Eve Online in Stackless Python
fmspy.org with stackless python
etc.
And something like GIL feels like a reasonable compromise to me.
I think PHP is the last of the major scripting languages on the market
without some kind of thread support.
(I cannot find a good comparison, but Ruby, Python and Perl all have some.)
Tyrael
Hi!
Eve Online in Stackless Python
fmspy.org http://fmspy.org with stackless python
etc.
I don't know how python does it but PHP has a lot of global context, and
sharing this global context between threads, whatever they are (OS
threads, user-space threads, etc.) would be a massively complex thing to
manage either for the engine or for the user.
And something like GIL feels like a reasonable compromise to me.
GIL would mean dreams about multicore stuff are just dreams, but beyond
that even with GIL, what do you do if you checked for class_exists and
it didn't exist, and you're about to load it when other thread loads it
before? Etc., etc. You'd have to spend a lot of time thinking about
stuff like that in your code. The fact that GIL protects you from
C-level context changes doesn't mean it'd protect you from PHP-level
context changes, like some code using some data structure and other code
changing it (on C level it'd be ok - no memory corruption, etc. - but on
PHP level it might totally break your code).
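The check-then-act problem described above can be sketched without real threads. In the sketch below each generator plays one "thread" and each yield marks a point where a scheduler could switch to another one; autoload_task(), $classes and $log are hypothetical stand-ins for illustration, not engine APIs.

```php
<?php
// One simulated thread that loads class $name if it is not loaded yet.
// $classes stands in for the shared class table, $log collects problems.
function autoload_task(string $name, array &$classes, array &$log): Generator {
    if (!isset($classes[$name])) {        // check (think class_exists())
        yield;                            // a context switch happens here
        if (isset($classes[$name])) {
            // Another thread declared it in the meantime; a real engine
            // would raise a "cannot redeclare class" error at this point.
            $log[] = "$name redeclared";
        }
        $classes[$name] = true;           // act (think require of the file)
    }
}

$classes = [];
$log = [];
$threads = [
    autoload_task('Foo', $classes, $log),
    autoload_task('Foo', $classes, $log),
];

// Both threads run their check and are then "preempted" at the yield.
foreach ($threads as $t) {
    $t->current();
}
// Both threads resume and act on a check result that is now stale.
foreach ($threads as $t) {
    $t->next();
}

print_r($log);
```

Each individual step here is atomic, which is roughly what a GIL would guarantee, yet the second thread still acts on an outdated answer.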
That's not to say it can't be done, but I haven't yet seen a coherent proposal
for any threads implementation that also had good use cases.
I think PHP is the last from the major scripting languages on the market
without some kind of thread support.
(Cannot find a good comparison, but Ruby, Python, Perl does some)
And "keeping up with Joneses" is not a good use case.
Stanislav Malyshev, Zend Software Architect
stas@zend.com http://www.zend.com/
(408)253-8829 MSN: stas@zend.com
Hi!
Eve Online in Stackless Python
fmspy.org http://fmspy.org with stackless python
etc.
I don't know how python does it but PHP has a lot of global context, and
sharing this global context between threads, whatever they are (OS threads,
user-space threads, etc.) would be massively complex thing to manage either
for the engine or for the user.
Let me make it slightly clearer: doing what Python or Perl does using
the ZendEngine2 would mean a complete rewrite of the engine. It is a
short answer but, sadly, it reflects the facts.
Cheers,
Pierre
@pierrejoye | http://blog.thepimp.net | http://www.libgd.org
Hi!
Eve Online in Stackless Python
fmspy.org http://fmspy.org with stackless python
etc.
I don't know how python does it but PHP has a lot of global context, and
sharing this global context between threads, whatever they are (OS threads,
user-space threads, etc.) would be massively complex thing to manage either
for the engine or for the user.
Let make it slightly more clear: To do what python or perl does using
the ZendEngine2 would mean a complete rewrite of the engine. It is a
short answer but reflects the fact, sadly.
That's a sad thing. :(
Any volunteers to do that? ;)
Thank you for pointing that out.
Tyrael
Cheers,
Pierre
@pierrejoye | http://blog.thepimp.net | http://www.libgd.org
I use pcntl_fork() for writing parallel multi-process applications and it
works pretty well.
Also, you can use shared memory queues to pass messages between processes
(i.e. msg_get_queue()).
I wrote a little proof of concept library a while ago to demonstrate:
http://github.com/dhotson/Phork
It's not that complicated... the main part of it is only ~110 lines of code:
http://github.com/dhotson/Phork/blob/master/classes/Phork/Process.php
Regards,
Dennis
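A minimal sketch of the fork-plus-queue approach described above, assuming a POSIX system with the pcntl and sysvmsg extensions and PHP run from the CLI as a script file. It is not taken from Phork; the worker count and the work each child does are arbitrary. Note that msg_get_queue() opens a System V message queue.

```php
<?php
// Assumption: pcntl + sysvmsg are loaded; ftok() needs a real file path.
$queue = msg_get_queue(ftok(__FILE__, 'q'), 0600);

$workers = 3;
$pids = [];
for ($n = 1; $n <= $workers; $n++) {
    $pid = pcntl_fork();
    if ($pid === -1) {
        exit("fork failed\n");
    }
    if ($pid === 0) {
        // Child: compute in its own process, then report back and quit.
        $result = ['worker' => $n, 'sum' => array_sum(range(1, $n * 1000))];
        msg_send($queue, 1, $result);    // the array is serialized by default
        exit(0);
    }
    $pids[] = $pid;                      // parent remembers the child
}

// Parent: collect exactly one message per worker (blocking receive).
$results = [];
for ($i = 0; $i < $workers; $i++) {
    msg_receive($queue, 1, $type, 4096, $message);
    $results[$message['worker']] = $message['sum'];
}

// Reap the children and remove the queue from the system.
foreach ($pids as $pid) {
    pcntl_waitpid($pid, $status);
}
msg_remove_queue($queue);

ksort($results);
print_r($results);
```

The parent never shares variables with the children; everything they compute comes back as a copied message, which is the "whole new protocol" cost discussed earlier in the thread.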
Hi,
I think that if we were ever to implement threading we would be best off
to enable spawning worker threads that have their own context with no
shared data (and therefore no requirement for locking). We could then
have a message passing API between the threads.
Advantages:
- Real multi-threading.
- Simple straightforward approach which doesn't require a comp. sci.
degree to use correctly.
- Very stable implementation.
You can tell by this that:
a) I think GIL is not the way to go. It's more complex, not truly
multi-threaded, and implementation may never be 100%.
b) True multi-threading with data sharing in my opinion is a recipe for
disaster. Not only because of the implementation complexities but I
think it makes life very hard for the developer and requires a lot of
sophistication.
So if I'd implement something I'd definitely do true multi-threading
with message passing (we basically already have the infrastructure with
TSRM to support that). Do I think this is a high priority item? Not
really but I can understand that it could add value to some. I think for
some Web requests it could actually allow for some parallel processing
with a rendezvous which could help reduce latency (not overall server
throughput). Then again, there'd be some overhead for having a
thread-safe build so I don't see this as something that would be enabled
by the masses - at least not initially.
Andi
On 2-4-2010 7:16, Andi Gutmans wrote:
I think that if we were ever to implement threading we would be best off
to enable spawning worker threads that have their own context with no
shared data (and therefore no requirement for locking). We could then
have a message passing API between the threads.
Advantages:
- Real multi-threading.
- Simple straightforward approach which doesn't require a comp. sci.
degree to use correctly.
- Very stable implementation.
That sounds like "I want threading; because it sounds cool!". What are
the advantages of this above multi-process?
The systemcall-overhead for message passing?
And why did nobody mention Aprils Fools yesterday; when the
request-for-threading was sent ;)
-- Jille
Jille Timmermans wrote:
On 2-4-2010 7:16, Andi Gutmans wrote:
I think that if we were ever to implement threading we would be best off
to enable spawning worker threads that have their own context with no
shared data (and therefore no requirement for locking). We could then
have a message passing API between the threads.
Advantages:
- Real multi-threading.
- Simple straightforward approach which doesn't require a comp. sci.
degree to use correctly.
- Very stable implementation.
That sounds like "I want threading; because it sounds cool!". What are
the advantages of this above multi-process?
The systemcall-overhead for message passing?
Actually Andi's outline forms a nice simple base for something practical. It
simply builds on the 'background' threading required to run asynchronous
operations while not creating an unmanageable mess. But I still can't see any
need to go beyond perhaps asynchronous SQL queries.
It still requires that all the non-thread safe code is addressed first? Even if
that simply means disabling extensions that are not safe?
And why did nobody mention Aprils Fools yesterday; when the
request-for-threading was sent ;)
Because it was after noon when it was sent ;)
--
Lester Caine - G8HFL
Contact - http://lsces.co.uk/wiki/?page=contact
L.S.Caine Electronic Services - http://lsces.co.uk
EnquirySolve - http://enquirysolve.com/
Model Engineers Digital Workshop - http://medw.co.uk//
Firebird - http://www.firebirdsql.org/index.php
Jille Timmermans wrote:
On 2-4-2010 7:16, Andi Gutmans wrote:
I think that if we were ever to implement threading we would be best off
to enable spawning worker threads that have their own context with no
shared data (and therefore no requirement for locking). We could then
have a message passing API between the threads.
Advantages:
- Real multi-threading.
- Simple straightforward approach which doesn't require a comp. sci.
degree to use correctly.
- Very stable implementation.
That sounds like "I want threading; because it sounds cool!". What are
the advantages of this above multi-process?
The systemcall-overhead for message passing?
Actually Andi's outline forms a nice simple base for something practical.
It simply builds on the 'background' threading required to run asynchronous
operations while not creating an unmanageable mess. But I still can't see any
need to go beyond perhaps asynchronous SQL queries.
Or asynchronous exec, or asynchronous (or at least timeout-aware)
gethostbyaddr, see:
http://bugs.php.net/bug.php?id=51306
So any task that requires waiting on an external resource could be executed in
parallel.
http://hu2.php.net/manual/en/mysqli.reap-async-query.php
It's a good thing that you can run MySQL queries asynchronously with mysqlnd.
Tyrael
It still requires that all the non-thread safe code is addressed first?
Even if that simply means disabling extensions that are not safe?
And why did nobody mention Aprils Fools yesterday; when the
request-for-threading was sent ;)
Because it was after noon when it was sent ;)
--
Lester Caine - G8HFL
Contact - http://lsces.co.uk/wiki/?page=contact
L.S.Caine Electronic Services - http://lsces.co.uk
EnquirySolve - http://enquirysolve.com/
Model Engineers Digital Workshop - http://medw.co.uk//
Firebird - http://www.firebirdsql.org/index.php
On 2-4-2010 7:16, Andi Gutmans wrote:
I think that if we were ever to implement threading we would be best off
to enable spawning worker threads that have their own context with no
shared data (and therefore no requirement for locking). We could then
have a message passing API between the threads.
Advantages:
- Real multi-threading.
- Simple straightforward approach which doesn't require a comp. sci.
degree to use correctly.
- Very stable implementation.
That sounds like "I want threading; because it sounds cool!".
Threading is cool. :) It has more value for me than traits/grafts, but I
won't deny that others may find those useful. Why do you deny this one?
What are the advantages of this above multi-process?
The systemcall-overhead for message passing?
I mentioned a few advantages in one of my previous posts.
Tyrael
Hi,
I think that if we were ever to implement threading we would be best off
to enable spawning worker threads that have their own context with no
shared data (and therefore no requirement for locking). We could then
have a message passing API between the threads.
Advantages:
- Real multi-threading.
- Simple straightforward approach which doesn't require a comp. sci.
degree to use correctly.
- Very stable implementation.
You can tell by this that:
a) I think GIL is not the way to go. It's more complex, not truly
multi-threaded, and implementation may never be 100%.
b) True multi-threading with data sharing in my opinion is a recipe for
disaster. Not only because of the implementation complexities but I
think it makes life very hard for the developer and requires a lot of
sophistication.
So if I'd implement something I'd definitely do true multi-threading
with message passing (we basically already have the infrastructure with
TSRM to support that). Do I think this is a high priority item? Not
really but I can understand that it could add value to some. I think for
some Web requests it could actually allow for some parallel processing
with a rendezvous which could help reduce latency (not overall server
throughput). Then again, there'd be some overhead for having a
thread-safe build so I don't see this as something that would be enabled
by the masses - at least not initially.
Andi
--
While this direction adds some overhead, I certainly like its advantages
over the others. +1
Eric Lee Stewart
Hi!
I think that if we were ever to implement threading we would be best off
to enable spawning worker threads that have their own context with no
shared data (and therefore no requirement for locking). We could then
have a message passing API between the threads.
No shared data requires quite a new development paradigm though. Imagine
you have a big application, say, using some Big Framework (BF), that needs to
do some mysql queries in parallel. Now what happens is:
- You can't really just start a thread and run a query whenever you
wish to, because the mysql connection is probably defined by configs that BF
is managing, so you'd have to either create a separate "query server"
thread which would keep the connection or open the connection anew each time.
- You can't use objects or anything more complex than an array between
threads, since sharing objects means sharing classes and methods, which
means sharing data stored there.
- You still have problems with libraries keeping global state, and many
do and don't even bother to tell you about it or provide any means to
manage it (e.g. ICU).
- The overhead of keeping a whole set of globals per thread will still be
there. Which btw means starting a thread would imply the whole RINIT
process, along with allocating globals, etc. which may not be as fast as
you'd expect.
I think it may be worth trying, but for that we need some good use-case
and try to see how it would work out with this use-case.
Stanislav Malyshev, Zend Software Architect
stas@zend.com http://www.zend.com/
(408)253-8829 MSN: stas@zend.com
On Fri, Apr 2, 2010 at 1:04 AM, Dennis Hotson dennis.hotson@gmail.com wrote:
I use pcntl_fork() for writing parallel multi-process applications and it
works pretty well.
Also, you can use shared memory queues to pass messages between processes
(i.e. msg_get_queue()).
I wrote a little proof of concept library a while ago to demonstrate:
http://github.com/dhotson/Phork
It's not that complicated... the main part of it is only ~110 lines of
code:
http://github.com/dhotson/Phork/blob/master/classes/Phork/Process.php
Regards,
Dennis
Yes, pcntl_fork() does exist, but there are some things that are better with
threading:
- support on windows
- easier inter-process/inter-thread communication/data passing.
- threads are cheaper
Tyrael
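For readers who have not used the approach Dennis describes, here is a minimal sketch of fork-plus-message-queue parallelism. It is not taken from Phork: parallel_squares() is a made-up helper, the squaring stands in for real CPU-bound work, and the code assumes a Unix CLI build with the pcntl and sysvmsg extensions and PHP 7.1 or later.

```php
<?php
// Sketch only: needs the pcntl and sysvmsg extensions (Unix, CLI).
// parallel_squares() is a hypothetical helper used for illustration.

function parallel_squares(array $inputs): array
{
    // Create a System V message queue under a random key.
    // Forked children inherit the handle from the parent.
    $queue = msg_get_queue(random_int(1, 0x7FFFFFFF), 0600);
    if ($queue === false) {
        throw new RuntimeException('could not create message queue');
    }

    $pids = [];
    foreach ($inputs as $n) {
        $pid = pcntl_fork();
        if ($pid === -1) {
            throw new RuntimeException('fork failed');
        }
        if ($pid === 0) {
            // Child: do the work, post the (serialized) result, and exit.
            msg_send($queue, 1, [$n, $n * $n]);
            exit(0);
        }
        $pids[] = $pid; // Parent: remember the child and keep forking.
    }

    // Parent: receive exactly one message per child, in arrival order.
    $results = [];
    foreach ($pids as $unused) {
        msg_receive($queue, 1, $receivedType, 4096, $message);
        [$n, $square] = $message;
        $results[$n] = $square;
    }

    // Reap the children and remove the queue from the system.
    foreach ($pids as $pid) {
        pcntl_waitpid($pid, $status);
    }
    msg_remove_queue($queue);

    ksort($results);
    return $results;
}
```

Calling parallel_squares([1, 2, 3]) returns the squares keyed by input. Everything that crosses the queue is serialized and copied, which is part of the data-passing friction mentioned above.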
I used to play with TSRM days ago and successfully implemented
userland threading support using GNU Pth. It's just a proof of
concept and I did it for fun.
If interested, check out
http://github.com/moriyoshi/php-src/tree/PHP_5_3-threading/ and
read http://github.com/moriyoshi/php-src/blob/PHP_5_3-threading/ext/threading/README
for detail (not much information though).
Also note that the language syntax was extended there so it would
support golang-like message passing.
<?php
// Runs only on the patched PHP_5_3-threading branch linked above;
// the <- operator is not part of stock PHP.
function sub($i, $ch) {
    for (;;) {
        // receive the message from $ch
        $a = <- [$ch];
        printf("%d: %s\n", $i, $a);
    }
}

$ch = thread_message_queue_create();
for ($i = 0; $i < 10; $i++) {
    thread_create('sub', $i, $ch);
}

$i = 0;
for (;;) {
    // send $i to $ch
    [$ch] <- $i++;
    usleep(50000);
}
?>
Moriyoshi
I used to play with TSRM days ago and successfully implemented
userland threading support using GNU Pth. It's just a proof of
concept and I did it for fun.
So these are share-nothing worker-threads, which can send results to "master-thread" using messages, right?
I am perfectly fine with such approach
some stylistic moments:
- I would use closures instead of callback-functions
- Is extra language construct really needed? function-call would work just fine
Is overhead of starting new thread large?
I used to play with TSRM days ago and successfully implemented
userland threading support using GNU Pth. It's just a proof of
concept and I did it for fun.
So these are share-nothing worker-threads, which can send results to "master-thread" using messages, right?
I am perfectly fine with such approach
A new thread can be created within a sub-thread as well. In addition,
messages can be interchanged between any pair of threads.
While it is based on shared-nothing approach, some kinds of resources
are shared across threads besides classes and functions that would
have already been defined before the thread creation.
some stylistic moments:
- I would use closures instead of callback-functions
I was trying hard to make closures work with the extension, but it
wouldn't end up with a success. I guess I can fix it somehow.
- Is extra language construct really needed? function-call would work just fine
I don't quite think so. It was just an experiment, and each extra
syntactic sugar would get converted to a corresponding single function
call (either thread_message_queue_post() or
thread_message_queue_poll()).
Is overhead of starting new thread large?
The cost is almost the same as when spawning a new runtime instance on
a threaded web server with TSRM enabled. If you'd pass a large data
to the subthread, then the overhead should go large because of the
deep copy.
Moriyoshi
Hello Moriyoshi,
Monday, April 5, 2010, 5:57:38 PM, you wrote:
Is overhead of starting new thread large?
The cost is almost the same as when spawning a new runtime instance on
a threaded web server with TSRM enabled. If you'd pass a large data
to the subthread, then the overhead should go large because of the
deep copy.
Note that could be optimized by implementing copy on write (with
properly placed hooks and shared data) between contexts - would save tons
of memory and speed up the thread creation by an order of magnitude.
Moriyoshi
--
Best regards,
speedy mailto:speedy.spam@gmail.com
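To make the term concrete: PHP already applies copy-on-write to arrays inside a single process, and the sketch below illustrates that behaviour only. It does not show sharing between thread contexts, which is what the suggestion above would require. cow_memory_deltas() is a made-up name, and the exact byte counts vary by PHP version.

```php
<?php
// Sketch only: shows in-process copy-on-write for arrays, not
// cross-thread sharing. cow_memory_deltas() is a hypothetical helper.

function cow_memory_deltas(int $size = 100000): array
{
    $original = range(1, $size);

    $base = memory_get_usage();
    $copy = $original;              // assignment: both names share one array
    $afterAssign = memory_get_usage();

    $copy[0] = -1;                  // first write: the array is duplicated now
    $afterWrite = memory_get_usage();

    return [
        'assign_bytes'   => $afterAssign - $base,       // expected to be tiny
        'write_bytes'    => $afterWrite - $afterAssign, // roughly one array
        'original_first' => $original[0],               // untouched by the write
        'copy_first'     => $copy[0],
    ];
}

$deltas = cow_memory_deltas();
printf(
    "assign: %d bytes, first write: %d bytes\n",
    $deltas['assign_bytes'],
    $deltas['write_bytes']
);
```

The assignment costs almost nothing and the duplication is deferred until the first write; applying the same idea across contexts is where the saving on thread start-up would come from.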
Hi,
Anyone care to summarize the thread on the wiki? It can just be a couple of sentences and links to news.php.net.
regards,
Lukas Kahwe Smith
mls@pooteeweet.org
Hello Moriyoshi,
Monday, April 5, 2010, 5:57:38 PM, you wrote:
While it is based on shared-nothing approach, some kinds of resources
are shared across threads besides classes and functions that would
have already been defined before the thread creation.
Maybe it would not be so hard to incrementally add thread safe constructs like
thread safe string, thread safe array, etc. which would be shared between
threads, similar to current constructs, just access-serialized via mutexes
/ rwsems, protected with memory barriers for the changes to be visible from all
CPUs.
Then, local storage could be done via normal containers, and global via
thread-safe ones.
I'm not sure how that could be exposed on the language level, though.
--
Best regards,
speedy mailto:speedy.spam@gmail.com