Hi,
I would like to know why the opcodes are not optimized, even with some
very simple optimizations like constant folding.
For example:
line  #  op    fetch  ext  return  operands
  60  0  ADD                ~0     5, 7
      1  ECHO                      ~0
which is "echo 5+7;"
-- Mathieu Suen
On 13.01.2010 12:18, mathieu.suen wrote:
I would like to know why the opcodes are not optimized.
Because any optimization, even a very simple one, imposes a performance
penalty in the default execution model of PHP, which does not use a
bytecode cache.
Only when the bytecode is not regenerated for each execution does it
make sense to invest time in the then one-time compilation.
--
Sebastian Bergmann Co-Founder and Principal Consultant
http://sebastian-bergmann.de/ http://thePHP.cc/
Sebastian Bergmann wrote:
On 13.01.2010 12:18, mathieu.suen wrote:
I would like to know why the opcodes are not optimized.
Because any optimization, even a very simple one, imposes a performance
penalty in the default execution model of PHP, which does not use a
bytecode cache.
For simple optimizations I don't think so. Take this simple example:
function foo()
{
$a = 45;
return $a;
}
Here, if you don't optimize, you are creating a variable, so you put
pressure on the GC and on memory.
A benchmark would be best.
By the way, why is there no native bytecode cache?
Only when the bytecode is not regenerated for each execution does it
make sense to invest time in the then one-time compilation.
Sorry, I don't understand; what do you mean?
-- Mathieu Suen
mathieu.suen wrote:
Sebastian Bergmann wrote:
On 13.01.2010 12:18, mathieu.suen wrote:
Because any optimization, even a very simple one, imposes a performance
penalty in the default execution model of PHP, which does not use a
bytecode cache.

For simple optimizations I don't think so. Take this simple example:
function foo()
{
$a = 45;
return $a;
}

Here, if you don't optimize, you are creating a variable, so you put
pressure on the GC and on memory.
But most of the time, the act of optimising will take longer than just
compiling and running the code, because you have to make decisions about
whether something can be optimised and the best way to do it. As
Sebastian said, it only makes sense to invest that time when you're
going to be reusing the compiler output. Without an opcode cache, PHP
just throws away the results of the compilation, so there are zero
advantages to optimisation.
A benchmark would be best.

By the way, why is there no native bytecode cache?

Only when the bytecode is not regenerated for each execution does it
make sense to invest time in the then one-time compilation.

Sorry, I don't understand; what do you mean?
What Sebastian means is that it would only make sense to optimise if
you're going to cache the output -- otherwise it is simply wasted time
that could be better spent on other things.
Dave
Hi,
Optimizations such as folding 5+7 into 12 really don't get you much. ZEND_ADD (and
other basic opcodes) are not in any way a slow point in a program. And
unfortunately to be able to optimize these you would probably need to put in
an extra pass in the compiler which would probably just slow things down
(unless you have a LOT of these types of additions).
As for the foo() example... This looks very simple; however, it is actually a
very hard problem that would most likely take far more time and resources to
solve in the compiler than it would to just leave it be. The problem here is
that you need to understand everywhere that $a is assigned a value and
everywhere its value is used. The problem becomes very hard in functions
that have loops and other types of control structures. It really just ends up
becoming a fairly complex mess to solve.
The same can be said about quite a few of the other optimizations you can
think of. On the surface they seem simple (and a few of them actually are),
but most of them are complex... largely due to some of the unique 'features'
of PHP.
In any case, optimization in PHP is not a lost cause. The first thing you
should really do is be using an opcode cache such as APC. Other than that
there are some solutions being worked on. There is Zend Optimizer and there
is pecl/optimizer (which, to warn you, is probably far from stable).
There are also a few efforts to compile PHP, such as the PHC compiler.
Overall though, more often than not PHP is not the bottleneck of your
program and thus optimization won't get you too much.
- Graham Kelly
Graham Kelly wrote:
In any case, optimization in PHP is not a lost cause. The first thing you
should really do is be using an opcode cache such as APC. Other than that
Unfortunately: APC does not work with PHP 5.3 -- I have a site where I would
love to use it but I cannot. I use APC to great effect elsewhere.
Can anyone say when APC will be fixed for PHP 5.3? And what about it being
ready for PHP 6?
there are some solutions being worked on. There is Zend Optimizer and there
is pecl/optimizer (which, to warn you, is probably far from stable).
There are also a few efforts to compile PHP, such as the PHC compiler.

Overall though, more often than not PHP is not the bottleneck of your
program and thus optimization won't get you too much.
--
Alain Williams
Linux/GNU Consultant - Mail systems, Web sites, Networking, Programmer, IT Lecturer.
+44 (0) 787 668 0256 http://www.phcomp.co.uk/
Parliament Hill Computers Ltd. Registration Information: http://www.phcomp.co.uk/contact.php
Past chairman of UKUUG: http://www.ukuug.org/
#include <std_disclaimer.h>
Alain Williams wrote:
Unfortunately: APC does not work with PHP 5.3 -- I have a site where I would
love to use it but I cannot. I use APC to great effect elsewhere.
The svn version works ok with 5.3. Turn off gc though. You shouldn't
be writing code that requires garbage collection anyway if you are
looking for speed.
-Rasmus
Alain Williams wrote:
Unfortunately: APC does not work with PHP 5.3 -- I have a site where I would
love to use it but I cannot. I use APC to great effect elsewhere.

The svn version works ok with 5.3. Turn off gc though. You shouldn't
be writing code that requires garbage collection anyway if you are
looking for speed.

Thanks. That compiles nicely and seems to work - CentOS 5.4 - tested on
both 32 & 64 bit.
gc? I cannot see anything in php.ini about that -- other than session garbage
collection; I assume you don't mean that?
Much of my motivation for APC is using large programs & class libraries (e.g.
Smarty, MediaWiki) that take a huge amount of time to compile - but the
execution path is only a small fraction of the code.
BTW: 'make test' fails horribly because the modules/ directory doesn't contain
most of the modules used (it only contains apc.so). Is there a way of having
'extension_dir' be a ':'-separated PATH so that more than one can be listed?
Alain Williams wrote:
gc? I cannot see anything in php.ini about that -- other than session garbage
collection; I assume you don't mean that?
zend.enable_gc = Off
Hi.
Alain Williams wrote:
Unfortunately: APC does not work with PHP 5.3 -- I have a site where I would
love to use it but I cannot. I use APC to great effect elsewhere.
Hm. I have 5.3.1 with APC 3.1.3p1 and it runs fine. This is not a
production environment, but I have not yet had the impression APC was
broken.
What is it that doesn't work for you?
The svn version works ok with 5.3. Turn off gc though.
Why is that advisable? Any pointers to background information welcome.
Regards,
Karsten
Karsten Dambekalns wrote:
Why is that advisable? Any pointers to background information welcome.
The gc code when combined with apc is still a bit shaky in 5.3. I
haven't figured out why yet. And my motivation for figuring it out is
pretty low as code that relies on gc is slow.
-Rasmus
The gc code when combined with apc is still a bit shaky in 5.3. I
haven't figured out why yet. And my motivation for figuring it out is
pretty low as code that relies on gc is slow.

-Rasmus
Motivation for relying on GC in 5.3 is pretty low because 5.3 is still a bit
shaky...
Graham Kelly wrote:
Overall though, more often than not PHP is not the bottleneck of your
program and thus optimization wont get you too much.
In a lot of ways, PHP is already well-optimised. The hash tables are
fast, the executor is decent, as executors for weakly-typed languages
go. Many internal functions have quite reasonable C implementations.
Given this, sometimes it's easy to forget that PHP is pathologically
memory hungry, to the point of making simple tasks difficult or
impossible to perform in limited environments. It's the worst language
I've ever encountered in this respect. An array of small strings will
use on the order of 200 bytes per element. An array of integers will use
not much less. A simple object (due to being based on the same
inefficient data structure) may use a kilobyte or two.
Despite the large amount of time I've spent optimising MediaWiki for
memory usage, it still can't run reliably with memory_limit set less
than about 80MB. That means you need a server with 500MB if you want to
set MaxClients high enough to let a few people use it at the same time.
So if it were my job to set priorities for PHP development, I'd spend
less time thinking about folding constants and more time thinking about
things like:
- Objects that can optionally pack themselves into a class-dependent
  structure and unpack on demand
- Exposing strongly-typed list and vector data structures to the user,
  that don't have massive hashtable overheads
- An oparray format with fewer 64-bit pointers and more smallish integers
That sort of thing.
-- Tim Starling
Tim Starling wrote:
Given this, sometimes it's easy to forget that PHP is pathologically
memory hungry, to the point of making simple tasks difficult or
impossible to perform in limited environments. It's the worst language
I've ever encountered in this respect. An array of small strings will
use on the order of 200 bytes per element. An array of integers will use
not much less. A simple object (due to being based on the same
inefficient data structure) may use a kilobyte or two.
A zval is around 64 bytes. So, to use 200 bytes per string element,
each of your strings must be around 136 chars long.
For me, working in super high-load environments, this was never an issue
because memory was always way more plentiful than cpu. You can only
slice a cpu in so many slices. Even if you could run 1024 concurrent
Apache/PHP processes, you wouldn't want to unless you could somehow
shove 64 cpus into your machine. For high-performance high-load
environments you want to get each request serviced as fast as possible
and attempting to handle too many concurrent requests works against you
here.
-Rasmus
Rasmus Lerdorf wrote:
Tim Starling wrote:
Given this, sometimes it's easy to forget that PHP is pathologically
memory hungry, to the point of making simple tasks difficult or
impossible to perform in limited environments. It's the worst language
I've ever encountered in this respect. An array of small strings will
use on the order of 200 bytes per element. An array of integers will use
not much less. A simple object (due to being based on the same
inefficient data structure) may use a kilobyte or two.

A zval is around 64 bytes. So, to use 200 bytes per string element,
each of your strings must be around 136 chars long.
<?php
$m = memory_get_usage();
$a = explode(',', str_repeat(',', 100000));
print (memory_get_usage() - $m)/100000;
?>
I get 197 on 32-bit and 259 on 64-bit. Try it for yourself if you don't
believe me. I've cross-checked memory_get_usage() against "ps -o rss";
it's pretty accurate.
For me, working in super high-load environments, this was never an issue
because memory was always way more plentiful than cpu. You can only
slice a cpu in so many slices. Even if you could run 1024 concurrent
Apache/PHP processes, you wouldn't want to unless you could somehow
shove 64 cpus into your machine. For high-performance high-load
environments you want to get each request serviced as fast as possible
and attempting to handle too many concurrent requests works against you
here.
Maybe the tasks you do are usually with small data sets.
-- Tim Starling
Hi!
<?php
$m = memory_get_usage();
$a = explode(',', str_repeat(',', 100000));
print (memory_get_usage() - $m)/100000;
Says 93.2482 for me. Should be even less, since the string generated by
str_repeat itself is also counted as overhead (without it, it's 92.2474).
Aren't you perchance using a debug build? A debug build gives 196 for me.
Stanislav Malyshev, Zend Software Architect
stas@zend.com http://www.zend.com/
(408)253-8829 MSN: stas@zend.com
Hi!
Says 93.2482 for me. Should be even less since string generated by
On 64-bit I get about 170 bytes for 5.2, don't have 5.3 build handy on
64-bit.
Says 93.2482 for me. Should be even less since string generated by
On 64-bit I get about 170 bytes for 5.2, don't have 5.3 build handy on 64-bit.
On 64bit (debug builds):
derick@kossu:~$ pe 5.3dev
derick@kossu:~$ php
<?php
$m = memory_get_usage();
$a = explode(',', str_repeat(',', 100000));
print (memory_get_usage() - $m)/100000;
?>
378.54448
derick@kossu:~$ pe 5.2dev
derick@kossu:~$ php
<?php
$m = memory_get_usage();
$a = explode(',', str_repeat(',', 100000));
print (memory_get_usage() - $m)/100000;
?>
370.57952
with kind regards,
Derick
http://derickrethans.nl | http://xdebug.org
twitter: @derickr
Stanislav Malyshev wrote:
Hi!
Says 93.2482 for me. Should be even less since string generated by
On 64-bit I get about 170 bytes for 5.2, don't have 5.3 build handy on
64-bit.
178.4972 5.3 non-debug 64-bit Linux
-Rasmus
Stanislav Malyshev wrote:
Hi!
<?php
$m = memory_get_usage();
$a = explode(',', str_repeat(',', 100000));
print (memory_get_usage() - $m)/100000;

Says 93.2482 for me. Should be even less, since the string generated by
str_repeat itself is also counted as overhead (without it, it's 92.2474).
Aren't you perchance using a debug build? A debug build gives 196 for me.
Yes, it was debug on 32-bit, but non-debug on 64-bit. So non-debug
memory usage on 64-bit is still 259 bytes per element. On 64-bit I am
using PHP 5.2.4-2ubuntu5.7wm1 from apt.wikimedia.org.
In another post:
HashTable uses 40 bytes, zval is 16 bytes, Bucket is 36 bytes, which
means if you use integer indexes, the overhead is 72 bytes per value
including memory block headers and alignments. It might be too much
for you, in which case I'd go towards making an extension that creates
an object storing strings more efficiently and implementing either
get/set handlers or ArrayAccess (or both). This of course would be
most useful if you access only a small part of the strings in each
function/method.
Fair enough, but we do have to support default installations. We do
already have a couple of optional extensions which reduce memory usage,
but they do more specific tasks than that.
I do not see what could be removed from Bucket or zval without hurting
the functionality.
Right, and that's why PHP is so bad compared to other languages. Its
one-size-fits-all data structure has to store a lot of data per element
to support every possible use case. However, there is room for
optimisation. For instance, an array could start off as being like a C++
std::vector. Then when someone inserts an item into it with a
non-integer key, it could be converted to a hashtable. This could
potentially give you a time saving as well, because conversion to a
hashtable could resize the destination hashtable in one step instead of
growing it O(log N) times.
Some other operations, like deleting items from the middle of the array
or adding items past the end (leaving gaps) would also have to trigger
conversion. The point would be to optimise the most common use cases for
integer-indexed arrays.
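A rough, self-contained sketch of what such a dual representation might
look like, with entirely invented names (the fixed-capacity stub table
below stands in for the real Zend HashTable):

/* Sketch of the vector-to-hashtable idea. Every name here is invented;
 * the stub table stands in for the real Zend HashTable and has a fixed
 * capacity, enough for this demo. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct { char *key; long val; } bucket;
typedef struct { bucket *b; size_t cap, used; } stub_ht;

static size_t hash(const char *s, size_t cap)
{
    size_t h = 5381;
    while (*s) h = h * 33 + (unsigned char)*s++;
    return h % cap;
}

static stub_ht *ht_create(size_t hint)        /* sized once, up front */
{
    stub_ht *t = malloc(sizeof *t);
    t->cap = hint * 2 + 8;
    t->used = 0;
    t->b = calloc(t->cap, sizeof *t->b);
    return t;
}

static void ht_set(stub_ht *t, const char *key, long val)
{
    size_t i = hash(key, t->cap);
    while (t->b[i].key && strcmp(t->b[i].key, key) != 0)
        i = (i + 1) % t->cap;                 /* linear probing */
    if (!t->b[i].key) { t->b[i].key = strdup(key); t->used++; }
    t->b[i].val = val;
}

typedef enum { STORE_VECTOR, STORE_HASH } store_kind;

typedef struct {
    store_kind kind;
    long *vec;            /* vector form: contiguous values, no buckets */
    size_t len, cap;
    stub_ht *ht;          /* hash form, materialised only on demand     */
} flex_array;

static void fa_push(flex_array *a, long val)  /* fast path: int append */
{
    if (a->kind == STORE_HASH) {
        char k[32];
        snprintf(k, sizeof k, "%zu", a->len++);
        ht_set(a->ht, k, val);
        return;
    }
    if (a->len == a->cap) {
        a->cap = a->cap ? a->cap * 2 : 8;
        a->vec = realloc(a->vec, a->cap * sizeof *a->vec);
    }
    a->vec[a->len++] = val;
}

/* The first string key converts the array in one shot: the destination
 * table is sized once instead of being grown O(log N) times. */
static void fa_set_str(flex_array *a, const char *key, long val)
{
    if (a->kind == STORE_VECTOR) {
        a->ht = ht_create(a->len + 1);
        for (size_t i = 0; i < a->len; i++) {
            char k[32];
            snprintf(k, sizeof k, "%zu", i);
            ht_set(a->ht, k, a->vec[i]);
        }
        free(a->vec);
        a->kind = STORE_HASH;
    }
    ht_set(a->ht, key, val);
}

int main(void)
{
    flex_array a = { STORE_VECTOR, NULL, 0, 0, NULL };
    for (long i = 0; i < 5; i++) fa_push(&a, i * i);
    fa_set_str(&a, "name", 42);               /* triggers the conversion */
    printf("converted: %s, entries: %zu\n",
           a.kind == STORE_HASH ? "yes" : "no", a.ht->used);
    return 0;
}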
not much less. A simple object (due to being based on the same
inefficient data structure) may use a kilobyte or two.

A kilobyte looks like too much for a single simple object (unless we
have different notions of simple). Could you describe what exactly
makes up the kilobyte - what's in the object?
<?php
class C {
var $v1, $v2, $v3, $v4, $v5, $v6, $v7, $v8, $v9, $v10;
}
$m = memory_get_usage();
$a = array();
for ( $i = 0; $i < 10000; $i++ ) {
$a[] = new C;
}
print ((memory_get_usage() - $m) / 10000) . "\n";
?>
1927 bytes (I'll use 64-bit from now on since it gives the most shocking
numbers)
- Objects that can optionally pack themselves into a class-dependent
structure and unpack on demand

Objects can do pretty much anything in Zend Engine now, provided you
do some C :) For the engine, an object is basically a pointer and an
integer; the rest is changeable. Of course, on the PHP level we need to
have more, but that's because certain things are just not doable on the
PHP level. Do you have some specific use case that would allow to reduce
Basically I'm thinking along the same lines as the array optimisation I
suggested above. For my class C in the test above, the zend_class_entry
would have a hashtable like:
v1 => 0, v2 => 1, v3 => 2, v4 => 3, v5 => 4, v6 => 5, v7 => 6, v8 =>7,
v9 => 8, v10 => 9
Then the object could be stored as a zval[10]. Object member access
would be implemented by looking up the member name in the class entry
hashtable and then using the resulting index into the zval[10]. When the
object is unpacked (say if the user creates or deletes object members at
runtime), then the object value becomes a hashtable.
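As a hypothetical sketch of that layout (invented names, not the Zend
object model; the class-side lookup is a linear scan here, where the real
thing would be the class-entry hashtable described above):

/* Packed-object idea: the class holds one shared name-to-slot map;
 * each instance is just an array of slots until a dynamic property
 * would force it to unpack into a hash table. */
#include <stdio.h>
#include <string.h>

#define NPROPS 10

typedef struct {
    const char *prop_names[NPROPS];  /* shared per-class slot map; the  */
    int nprops;                      /* real thing would be a hashtable */
} toy_class;

typedef struct {
    toy_class *ce;
    long slots[NPROPS];   /* packed storage: one slot per declared prop */
} toy_object;

/* Property access: resolve the name to a slot index via the class,
 * then index into the packed array. */
static long *prop_fetch(toy_object *o, const char *name)
{
    for (int i = 0; i < o->ce->nprops; i++)
        if (strcmp(o->ce->prop_names[i], name) == 0)
            return &o->slots[i];
    return NULL;  /* unknown name: would trigger unpacking instead */
}

int main(void)
{
    toy_class c = { { "v1","v2","v3","v4","v5",
                      "v6","v7","v8","v9","v10" }, NPROPS };
    toy_object o = { &c, { 0 } };
    *prop_fetch(&o, "v3") = 99;
    printf("v3 = %ld\n", *prop_fetch(&o, "v3"));   /* v3 = 99 */
    return 0;
}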
- Exposing strongly-typed list and vector data structures to the user,
that don't have massive hashtable overheads
- An oparray format with fewer 64-bit pointers and more smallish integers

Ah, you're on 64-bit... That explains why your memory requirements are
larger :) But I'm not sure how the data the op array needs can be stored
without using pointers.
Making oplines use a variable amount of memory (like they do in machine
code) would be a great help.
For declarations, you could pack structures like zend_class_entry and
zend_function_entry on to the end of the opline, and access them by
casting the opline to the appropriate opcode-specific type. That would
save pointers and also allocator overhead.
At the more extreme end of the spectrum, the compiler could produce a
pointerless oparray, like JVM bytecode. Then when a function is executed
for the first time, the oparray could be expanded, with pointers added,
and the result cached. This would reduce memory usage for code which is
never executed. And it would have the added advantage of making APC
easier to implement, since it could just copy the whole unexpanded
oparray with memcpy().
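As a hypothetical illustration of that two-stage scheme (none of these
types exist in the engine): the stored form refers to strings by offset
into a trailing pool and is relocatable; the runtime form with real
pointers is built lazily on first call:

/* Pointer-free stored oparray, expanded into a pointer-full runtime
 * form on first execution. Invented types, for illustration only. */
#include <stdlib.h>
#include <string.h>

typedef struct {
    unsigned opcode;
    unsigned name_off;     /* offset into the string pool, not a pointer */
} stored_op;

typedef struct {
    unsigned num_ops;
    unsigned pool_len;
    stored_op ops[];       /* ops[], then the string pool: one flat blob
                              that could be copied with a single memcpy() */
} stored_oparray;

typedef struct {
    unsigned opcode;
    const char *name;      /* real pointer, valid in this process only */
} runtime_op;

/* Built lazily on first call, then cached; code that never runs never
 * pays for the expansion. */
static runtime_op *expand(const stored_oparray *s)
{
    const char *pool = (const char *)(s->ops + s->num_ops);
    runtime_op *r = malloc(s->num_ops * sizeof *r);
    for (unsigned i = 0; i < s->num_ops; i++) {
        r[i].opcode = s->ops[i].opcode;
        r[i].name = pool + s->ops[i].name_off;   /* pointer fix-up */
    }
    return r;
}

int main(void)
{
    const char pool[] = "foo\0bar";              /* "foo" at 0, "bar" at 4 */
    stored_oparray *s = malloc(sizeof *s + 2 * sizeof(stored_op) + sizeof pool);
    s->num_ops = 2;
    s->pool_len = sizeof pool;
    s->ops[0] = (stored_op){ 1, 0 };
    s->ops[1] = (stored_op){ 2, 4 };
    memcpy((char *)(s->ops + s->num_ops), pool, sizeof pool);
    runtime_op *r = expand(s);                   /* r[1].name == "bar" */
    free(r); free(s);
    return 0;
}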
-- Tim Starling
Tim Starling wrote:
Some other operations, like deleting items from the middle of the array
or adding items past the end (leaving gaps) would also have to trigger
conversion. The point would be to optimise the most common use cases for
integer-indexed arrays.
I still say this isn't something most people run into. I have looked at
a lot of code in a lot of different use cases, and I always see things
being cpu-bound long before they are memory-bound.
-Rasmus
Tim Starling wrote:
<?php
class C {
var $v1, $v2, $v3, $v4, $v5, $v6, $v7, $v8, $v9, $v10;
}
$m = memory_get_usage();
$a = array();
for ( $i = 0; $i < 10000; $i++ ) {
$a[] = new C;
}
print ((memory_get_usage() - $m) / 10000) . "\n";
?>

1927 bytes (I'll use 64-bit from now on since it gives the most shocking
numbers)
PHP 5.3.3-dev (cli) (built: Jan 11 2010 11:26:25)
Linux colo 2.6.31-1-amd64 #1 SMP Sat Oct 24 17:50:31 UTC 2009 x86_64
php > class C {
php { var $v1, $v2, $v3, $v4, $v5, $v6, $v7, $v8, $v9, $v10;
php { }
php >
php > $m = memory_get_usage();
php > $a = array();
php > for ( $i = 0; $i < 10000; $i++ ) {
php { $a[] = new C;
php { }
php > print ((memory_get_usage() - $m) / 10000) . "\n";
1479.5632
So you need 1500 bytes per object in your array. I still fail to see
the problem for a web request. Maybe I am just old-fashioned in the way
I look at this stuff, but if you have more than 1000 objects loaded on a
single request, you are doing something wrong as far as I am concerned.
This is why we do things like unbuffered mysql queries, zero-copy stream
passing, etc. We never want entire result sets or entire files in
memory because even if we optimize the crap out of it, it is still going
to be way faster to simply not do that.
-Rasmus
Rasmus Lerdorf wrote:
This is why we do things like unbuffered mysql queries, zero-copy stream
passing, etc. We never want entire result sets or entire files in
memory because even if we optimize the crap out of it, it is still going
to be way faster to simply not do that.
Actually, with mysqlnd a buffered set might be faster, if you know what
you are doing, because the data won't be copied once more. With
unbuffered sets, data is copied from the network buffer to the zval. With
buffered sets, the zval just points to the network buffer. If you have
the RAM then buffered should be faster. Of course, you should use the set
and close it when you are finished, not fetch-close-then-process, because
then a copy is forced.
Best,
Andrey
Hi!
class C {
var $v1, $v2, $v3, $v4, $v5, $v6, $v7, $v8, $v9, $v10;
}
$m = memory_get_usage();
$a = array();
for ( $i = 0; $i < 10000; $i++ ) {
$a[] = new C;
}
print ((memory_get_usage() - $m) / 10000) . "\n";
?>

1927 bytes (I'll use 64-bit from now on since it gives the most shocking
numbers)
OK, you have an object with 10 vars - as we established, vars in an array
take 100-200 bytes of overhead (depending on bitness - 64-bit is fatter),
so it fits the pattern.
Then the object could be stored as a zval[10]. Object member access
would be implemented by looking up the member name in the class entry
hashtable and then using the resulting index into the zval[10]. When the
object is unpacked (say if the user creates or deletes object members at
runtime), then the object value becomes a hashtable.
That would mean having two object types - "packed" and "unpacked" - with
all (or most) operations basically duplicated. However, for objects it's
easier than for arrays, since the objects API is more abstract. I'm not
sure that would improve the situation, though - a lot of objects are
dynamic, and for those it would mean a penalty when the object is unpacked.
But this can be tested on the current engine (maybe even without
breaking BC!) and if it gives good results it may be an option.
Making oplines use a variable amount of memory (like they do in machine
code) would be a great help.

For declarations, you could pack structures like zend_class_entry and
zend_function_entry on to the end of the opline, and access them by
casting the opline to the appropriate opcode-specific type. That would
save pointers and also allocator overhead.
zend_class_entry is huge, why would you want to put it into the opline?
And what opline needs static zend_class_entry anyway?
At the more extreme end of the spectrum, the compiler could produce a
pointerless oparray, like JVM bytecode. Then when a function is executed
for the first time, the oparray could be expanded, with pointers added,
and the result cached. This would reduce memory usage for code which is
opcodes can be cached (bytecode caches do it) but op_array can't really
be cached between requests because it contains dynamic structures.
Unlike Java, PHP does full cleanup after each request, which means no
preserving dynamic data.
I'm not sure how avoiding pointers in the op_array in such a manner would
help, though - you'd still need to store things like function names, for
example, and since you need to store them somewhere, you'd also have some
pointer to this place. The same goes for a bunch of other op_array
properties - you'd need to store them somewhere and be able to find
them, so I don't see how you'd do it without a pointer of some kind
involved.
Stanislav Malyshev wrote:
opcodes can be cached (bytecode caches do it) but op_array can't
really be cached between requests because it contains dynamic
structures. Unlike Java, PHP does full cleanup after each request,
which means no preserving dynamic data.
APC deep-copies the whole zend_op_array, see apc_copy_op_array() in
apc_compile.c. It does it using an impressive pile of hacks which break
with every major release and in some minor releases too. Every time the
compiler allocates memory, there has to be a matching shared memory
allocation in APC.
But maybe you missed my point. I'm talking about a cache which is cheap
to construct and cleared at the end of each request. It would optimise
tight loops of calls to user-defined functions. The dynamic data, like
static variable hashtables, would be in it. The compact pointerless
structure could be stored between requests, and would not contain
dynamic data.
Basically a structure like the current zend_op_array would be created on
demand by the executor instead of in advance by the compiler.
I'm not sure how using pointers in op_array in such manner would help
though - you'd still need to store things like function names, for
example, and since you need to store it somewhere, you'd also have
some pointer to this place.
You can do it with a length field and a char[1] at the end of the
structure. When you allocate memory for the structure, you add some on
for the string. Then you copy the string into the char[1], overflowing it.
If you need several strings, then you can have several byte offsets,
which are added to the start of the char[1] to find the location of the
string in question. You can make the offset fields small, say 16 bits.
But it's mostly zend_op I'm interested in rather than zend_op_array.
Currently if a zend_op has a string literal argument, you'd make a zval
for it and copy it into op1.u.constant. But the zval allocation could be
avoided. The handler could cast the zend_op to a zend_op_with_a_string,
which would have a length field and an overflowed char[1] at the end for
the string argument.
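A small sketch of that layout (a hypothetical struct, not a real engine
type; it uses a C99 flexible array member rather than the literal
overflowed char[1], but the memory layout is the same):

/* The "overflowed char[1]" trick: the string lives inline at the end
 * of the op, so no separate zval allocation is needed. Hypothetical. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    unsigned short size;     /* total size of this op, for iteration */
    unsigned char  opcode;
    unsigned short str_len;
    char           str[];    /* string stored inline with the op */
} op_with_string;

static op_with_string *make_op(unsigned char opcode, const char *s)
{
    size_t len = strlen(s);
    op_with_string *op = malloc(sizeof *op + len + 1);
    op->size = (unsigned short)(sizeof *op + len + 1);
    op->opcode = opcode;
    op->str_len = (unsigned short)len;
    memcpy(op->str, s, len + 1);   /* copy into the trailing buffer */
    return op;
}

int main(void)
{
    op_with_string *op = make_op(42, "hello");
    printf("opcode=%u size=%u str=%s\n", op->opcode, op->size, op->str);
    free(op);
    return 0;
}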
A variable op size would make iterating through zend_op_array.opcodes
slightly more awkward, something like:
for (; op < oparray_end; op = (zend_op*)((char*)op + op->size)) {
...
But obviously you could clean that up with a macro.
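Something like this (hypothetical) helper:

/* Step over variable-sized ops; assumes each op stores its own total
 * size, as in the loop above. */
#define NEXT_OP(op) ((zend_op *)((char *)(op) + (op)->size))

for (; op < oparray_end; op = NEXT_OP(op)) {
...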
For Mr. "everyone has 8GB of memory and tiny little data sets" Lerdorf,
I could point out that reducing the average zend_op size and placing
strings close to other op data will also make execution faster, due to
the improved CPU cache hit rate.
-- Tim Starling
Tim Starling wrote:
For Mr. "everyone has 8GB of memory and tiny little data sets" Lerdorf,
I could point out that reducing the average zend_op size and placing
strings close to other op data will also make execution faster, due to
the improved CPU cache hit rate.
Nice twist there. I simply related memory to cpu and the assumption was
that if you had a dual quad-core system, chances are that you also had
8G of ram. Having 8 cores with only 1G of ram would be a weird server
config.
-Rasmus
Hi!
Basically a structure like the current zend_op_array would be created on
demand by the executor instead of in advance by the compiler.
I guess we could have strings, etc. put in one big string buffer and
refer to them by 32-bit index; that would probably work with statically
allocated things (like filenames, etc.). But that'd only be useful in
the 64-bit case, and would just slow down 32-bit (since we probably
couldn't afford 16-bit indexes), which means we'd either have separate
code for 32 and 64 or a ton of macros for each string access. I'm not
sure that is worth the trouble.
We could do something that could improve things somewhat - namely,
organize all static strings into per-op-array string table (in op_array
and znode zvals) and refer to them by index. That also would give us
some advantages since we could precalculate hashes. IIRC Dmitry Stogov
had done some research on that.
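A rough sketch of what such a per-op-array string table might look like,
with invented names (the real znode/op_array layouts differ):

/* Operands would store a small index instead of a char pointer, and
 * the hash is precalculated once at compile time. Illustration only. */
#include <stdlib.h>
#include <string.h>

typedef struct {
    char *str;
    size_t len;
    unsigned long hash;   /* precalculated, so runtime lookups skip it */
} interned_str;

typedef struct {
    interned_str *strings;
    unsigned count;
} string_table;

/* DJBX33A-style hash, the same scheme the engine uses for string keys. */
static unsigned long str_hash(const char *s, size_t len)
{
    unsigned long h = 5381;
    while (len--) h = h * 33 + (unsigned char)*s++;
    return h;
}

/* Add a string at compile time; the returned index is what an operand
 * would store in place of a pointer. */
static unsigned intern(string_table *t, const char *s)
{
    size_t len = strlen(s);
    t->strings = realloc(t->strings, (t->count + 1) * sizeof *t->strings);
    t->strings[t->count] = (interned_str){ strdup(s), len, str_hash(s, len) };
    return t->count++;
}

int main(void)
{
    string_table t = { NULL, 0 };
    unsigned idx = intern(&t, "strlen");   /* opline stores idx, not a char* */
    (void)idx;
    return 0;
}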
You can do it with a length field and a char[1] at the end of the
structure. When you allocate memory for the structure, you add some on
for the string. Then you copy the string into the char[1], overflowing it.

If you need several strings, then you can have several byte offsets,
which are added to the start of the char[1] to find the location of the
string in question. You can make the offset fields small, say 16 bits.
It'd definitely be too much trouble to work with such structures; it would
lead to a ton of bugs and be a nightmare to manage...
But it's mostly zend_op I'm interested in rather than zend_op_array.
Currently if a zend_op has a string literal argument, you'd make a zval
for it and copy it into op1.u.constant. But the zval allocation could be
No, the zval is part of the znode. There might be an allocation at the
compile stage, etc., but it's temporary - the zval itself is stored inside
the znode, not allocated elsewhere. See zend_compile.h.
avoided. The handler could cast the zend_op to a zend_op_with_a_string,
which would have a length field and an overflowed char[1] at the end for
the string argument.
Since we need to address zend_ops inside the array, variable-size ops would
be a major inconvenience. Also, since zval is a union, I'm not even
sure you'd be saving that much. A constant table, though, might allow some
savings, but would complicate opcodes somewhat.
A variable op size would make iterating through zend_op_array.opcodes
slightly more awkward, something like:

Note that we need not just iteration but also random access (and no, not
only for goto :) - many constructs are compiled into code including jumps).
BTW, as for more effective vars storage - did you look at SPL types,
especially SplFixedArray? It looks like exactly what you want with
fixed-size storage.
-----Original Message-----
From: Tim Starling [mailto:tstarling@wikimedia.org]
Sent: Wednesday, January 13, 2010 7:19 PM
To: Stas Malyshev
Cc: internals@lists.php.net
Subject: Re: [PHP-DEV] About optimization
You can do it with a length field and a char[1] at the end of the
structure. When you allocate memory for the structure, you add some on
for the string. Then you copy the string into the char[1], overflowing it.

If you need several strings, then you can have several byte offsets,
which are added to the start of the char[1] to find the location of the
string in question. You can make the offset fields small, say 16 bits.

But it's mostly zend_op I'm interested in rather than zend_op_array.
Currently if a zend_op has a string literal argument, you'd make a zval
for it and copy it into op1.u.constant. But the zval allocation could be
avoided. The handler could cast the zend_op to a zend_op_with_a_string,
which would have a length field and an overflowed char[1] at the end for
the string argument.
I tried the char[1] trick in the past. I can't quite remember why I
passed on it, but I think it was because it changed zval from having a
fixed size, and therefore zval allocations could no longer be cached
efficiently in the memory manager (and of course it does not work with
zend_opline-like structures where we have more than one zval in the
structure).
Andi
Tim Starling wrote:
Maybe the tasks you do are usually with small data sets.
Well, I was referring to Yahoo-sized stuff. So no, the datasets are
rather huge, but on a per-request basis you want to architect things so
you only load things you actually need on that one request.
If you really do need to play around with hundreds of thousands of
records of anything in memory on a single request, then you should
definitely be looking at writing an extension and doing that in a custom
data type streamlined for that particular type of data.
Keeping your Apache2 processes around 40M or below even for less than
efficient code was never much of a problem and that means you can do
about 50 processes in 2G of memory. You probably don't want to go much
beyond 50 concurrent requests on a single quad-core cpu since there just
won't be enough juice for each one to finish in a timely manner. Dual
quad-core and you can probably go to about 100, but you also tend to
have more ram in those. You can of course crank up the concurrency if
you are willing to take the latency hit.
For my own stuff that doesn't use any heavy framework code I easily keep
my per-Apache incremental memory usage under 10M.
-Rasmus
Hi!
Given this, sometimes it's easy to forget that PHP is pathologically
memory hungry, to the point of making simple tasks difficult or
impossible to perform in limited environments. It's the worst language
I've ever encountered in this respect. An array of small strings will
use on the order of 200 bytes per element. An array of integers will use
HashTable uses 40 bytes, zval is 16 bytes, Bucket is 36 bytes, which
means if you use integer indexes, the overhead is 72 bytes per value
including memory block headers and alignments. It might be too much for
you, in which case I'd go towards making an extension that creates an
object storing strings more efficiently and implementing either get/set
handlers or ArrayAccess (or both). This of course would be most useful
if you access only a small part of the strings in each function/method.
I do not see what could be removed from Bucket or zval without hurting
the functionality.
not much less. A simple object (due to being based on the same
inefficient data structure) may use a kilobyte or two.
A kilobyte looks like too much for a single simple object (unless we have
different notions of simple). Could you describe what exactly makes up
the kilobyte - what's in the object?
- Objects that can optionally pack themselves into a class-dependent
structure and unpack on demand
Objects can do pretty much anything in Zend Engine now, provided you do
some C :) For the engine, an object is basically a pointer and an integer;
the rest is changeable. Of course, on the PHP level we need to have more,
but that's because certain things are just not doable on the PHP level. Do
you have some specific use case that would allow to reduce
- Exposing strongly-typed list and vector data structures to the user,
that don't have massive hashtable overheads
- An oparray format with fewer 64-bit pointers and more smallish integers
Ah, you're on 64-bit... That explains why your memory requirements are
larger :) But I'm not sure how the data the op array needs can be stored
without using pointers.
- Exposing strongly-typed list and vector data structures to the user,
that don't have massive hashtable overheads
I'm actually working on a few things here... some more efficient sets and
hashes. Expect to see more soon.
regards,
Derick