I just realized something that never occurred to me before - every
property is actually stored as a hash.
This test-script will demonstrate:
<?php
// Compares the memory taken by 1000 small objects with the number of
// bytes of actual data stored in them.
define('NUM_TESTS', 1000);

// Passing true reports the memory reserved from the system, which grows
// in large blocks, so the resulting figure is coarse.
$before = memory_get_usage(true);
$test = array();

class Foo
{
    public $aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa;
    public $bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb;
    public $cccccccccccccccccccccccccccccccccccccccccccccccccccc;
    public $dddddddddddddddddddddddddddddddddddddddddddddddddddd;
}

$bytes = 0;
for ($i = 0; $i < NUM_TESTS; $i++) {
    $foo = new Foo;
    $foo->aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa = 'a';
    $foo->bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb = 'b';
    $foo->cccccccccccccccccccccccccccccccccccccccccccccccccccc = 'c';
    $foo->dddddddddddddddddddddddddddddddddddddddddddddddddddd = 'd';
    $bytes += 4; // four one-byte strings per object
    $test[] = $foo; // keep every object alive until the final measurement
}

$after = memory_get_usage(true);
header('Content-type: text/plain');
echo ($after - $before) . ' bytes used; ' . $bytes . ' bytes of information stored.';
?>
Output is this:
786432 bytes used; 4000 bytes of information stored.
I know this is an extreme example; I just did it to see if what I
suspected was actually correct.
How come it's necessary to store the property-names of every property
in every object? For properties that have been defined in classes, why
can't they be stored in a more efficient manner? (using lookup tables)
I know the nature of PHP is dynamic, and I know that dynamic
properties would have to be stored in a key/value form internally...
but if you look at modern PHP software, dynamic properties are actually
something very few people use.
My suspicion is that all this memory-overhead has performance
implications as well? Allocating and deallocating memory for all of
these repeated property-names, it can't be very efficient?
I don't know much about the inner workings of PHP or C in general, but
if the property-names are indeed stored repeatedly, and if the
string-type uses a pointer, wouldn't it be possible to point all of the
property-name strings to the same address in memory, sharing the
property-name strings, instead of storing them repeatedly?
Just a thought...
Hi!
> How come it's necessary to store the property-names of every property
> in every object? For properties that have been defined in classes, why
> can't they be stored in a more efficient manner? (using lookup tables)
No, because you can add and remove properties freely at runtime.
> I know the nature of PHP is dynamic, and I know that dynamic
> properties would have to be stored in a key/value form internally...
> but if you look at modern PHP software, dynamic properties is actually
> something very few people use.
This is not true.
Stanislav Malyshev, Software Architect
SugarCRM: http://www.sugarcrm.com/
(408)454-6900 ext. 227
Yeah, dynamic properties get used by default every time you
json_decode something, to take a random example.
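To make the json_decode point concrete, here is a small sketch (the JSON string and the property names are made up for illustration). Without the second argument, json_decode returns a stdClass object, which declares no properties of its own, so every decoded field ends up as a dynamic property:

```php
<?php
// Illustrative input only; any JSON object behaves the same way.
$json = '{"name": "example", "count": 3}';

// Default mode: the result is a stdClass instance.
$obj = json_decode($json);
var_dump($obj instanceof stdClass);

// "name" was never declared in a class; it exists only on this object.
var_dump(property_exists($obj, 'name'));

// Properties can still be added and removed at runtime.
$obj->extra = true;
unset($obj->count);
print_r(array_keys(get_object_vars($obj)));

// Passing true as the second argument yields an associative array instead.
$arr = json_decode($json, true);
var_dump(is_array($arr));
```

Code that wants to avoid dynamic properties here can decode to an array and copy the values into declared properties.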
String folding could be used, but that would require a hashtable
lookup and would probably be slower than allocation (at least until
you started to swap). Worth testing maybe.
Or... when you instantiate an object, it still looks like a hash, but
declared property names are initialized to point at shared strings.
Dynamic property names aren't. When the object is reclaimed, the
declared properties are removed first to prevent double deallocations,
and then the dynamic properties are destroyed.
I'm suggesting a lot of work here I'm sure, but this latter idea seems
like it might yield a large memory usage improvement.
--
Tom Boutell
P'unk Avenue
215 755 1330
punkave.com
window.punkave.com
No offense intended, but if you've got so many OOP objects flying
around that they are sucking down that much memory...
You probably need to refactor your code and just "don't do that"
Just my opinion.
--
brain cancer update:
http://richardlynch.blogspot.com/search/label/brain%20tumor
Donate:
https://www.paypal.com/cgi-bin/webscr?cmd=_s-xclick&hosted_button_id=FS9NLTNEEKWBE
Actually, I just updated Rasmus' demo program to use associative arrays
instead of objects.
In PHP 5.4.x, the associative array version uses more memory than the
object oriented version.
That's because PHP 5.4.x is using a flat array for predeclared
properties, as was mentioned earlier by Gustavo.
Associative arrays:
524288 bytes
Objects:
262144 bytes
The object-oriented version is also faster, by about 20%.
Interestingly these results don't change much if I make the property
names/keys much shorter. Probably there's a minimum allocation of 64
bytes for these or something.
It would appear there is no longer a penalty simply for using many
objects vs. many associative arrays in PHP 5.4. The opposite, in fact.
I'm sure arrays didn't get slower, but objects now take advantage of
some optimizations that become possible when properties are
predeclared.
However this doesn't mean that calling lots of setters will
necessarily be as fast as direct property access... oh what the heck,
let's test that too:
Calling dead-simple setters for the four properties rather than
setting them directly slows down the OOP version to the point where it
runs at just about the same speed as the associative array version.
That's not terrible. Direct property access is still fastest (after
all that's what the setters do after they pay the overhead of the
function call).
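The modified script was not posted, so the following is only a guess at what such a comparison might look like. The Point class, its setters and the measure() helper are hypothetical names, and the sketch calls memory_get_usage() without the true flag to get finer-grained figures than the original script. It assumes PHP 7 or later for the return type declaration, and the absolute numbers will differ between PHP versions and platforms.

```php
<?php
const NUM_ITEMS = 1000;

// Hypothetical class with four predeclared properties and trivial setters.
class Point
{
    public $a;
    public $b;
    public $c;
    public $d;

    public function setA($v) { $this->a = $v; }
    public function setB($v) { $this->b = $v; }
    public function setC($v) { $this->c = $v; }
    public function setD($v) { $this->d = $v; }
}

// Builds NUM_ITEMS values with $build and reports memory and elapsed time.
function measure(callable $build): array
{
    $memBefore = memory_get_usage();
    $start = microtime(true);
    $items = array();
    for ($i = 0; $i < NUM_ITEMS; $i++) {
        $items[] = $build();
    }
    return array(
        'count'   => count($items),
        'bytes'   => memory_get_usage() - $memBefore,
        'seconds' => microtime(true) - $start,
    );
}

// Associative array, filled key by key so each row is a fresh allocation.
$asArray = function () {
    $row = array();
    $row['a'] = 'a';
    $row['b'] = 'b';
    $row['c'] = 'c';
    $row['d'] = 'd';
    return $row;
};

// Object with direct property access.
$asObject = function () {
    $o = new Point;
    $o->a = 'a';
    $o->b = 'b';
    $o->c = 'c';
    $o->d = 'd';
    return $o;
};

// Object populated through setter calls.
$viaSetters = function () {
    $o = new Point;
    $o->setA('a');
    $o->setB('b');
    $o->setC('c');
    $o->setD('d');
    return $o;
};

$variants = array('arrays' => $asArray, 'objects' => $asObject, 'setters' => $viaSetters);
foreach ($variants as $label => $build) {
    $r = measure($build);
    printf("%-8s %d bytes, %.6f s\n", $label, $r['bytes'], $r['seconds']);
}
```

A single pass over 1000 items is too short for stable timings; repeating each measurement several times and comparing the best runs gives a fairer picture.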
--
Tom Boutell
P'unk Avenue
215 755 1330
punkave.com
window.punkave.com
Adding/removing properties at runtime is great if you want obscure,
unmaintainable code and don't think an IDE is useful.
So to make my previous statement more precise, dynamic properties are
not widely used in respectable modern codebases, and are generally
something a reputable developer would frown upon. Yes, some codebases
(e.g. Drupal) rely on this extensively, but modern frameworks
generally do not - in fact, they often take measures to ensure that
exceptions are thrown if you try to access a property that has not
been formally defined.
For that matter, most ORMs (a typical example of where dynamic
properties would come in handy) don't rely on dynamic properties
either - they rely on __get() and __set() and store the actual values
in a single property, as an array. So even for highly dynamic
components in modern frameworks, this is not a feature that is often
used.
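As a rough illustration of that pattern (not taken from any particular ORM; the Record class and its method names are hypothetical), the magic accessors route every read and write to one declared array property, so the object itself never gains a dynamic property:

```php
<?php
// Hypothetical record class: all column values live in $attributes.
class Record
{
    private $attributes = array();

    // Called when an inaccessible or undeclared property is read.
    public function __get($name)
    {
        return array_key_exists($name, $this->attributes)
            ? $this->attributes[$name]
            : null;
    }

    // Called when an inaccessible or undeclared property is written.
    public function __set($name, $value)
    {
        $this->attributes[$name] = $value;
    }

    // Makes isset($record->name) consult the array as well.
    public function __isset($name)
    {
        return isset($this->attributes[$name]);
    }

    public function toArray()
    {
        return $this->attributes;
    }
}

$user = new Record();
$user->email = 'someone@example.com';

echo $user->email, "\n";
var_dump(isset($user->email));
var_dump(isset($user->missing));
print_r($user->toArray());
```

The trade-off is that every access pays for a magic method call, and the values are still held in a hash table, just one that belongs to the array rather than to the object.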
Drupal-development aside, and perhaps some XOOPS-development back in
the dark ages, I can't actually recall when I've used dynamic
properties.
I suddenly realize why certain heavily-used classes in the Yii
framework have obscure property-names like $_m and $_p ... they're
trying to save memory. Not really logical that you should have to
compromise legible code in favor of saving memory.
Makes me wonder if this issue could be addressed, killing two birds
with one stone. Since the dynamic aspect is an inconvenience to many
developers, and since it causes memory-overhead whether they make use
of this feature or not, how about introducing a non-dynamic
alternative base-class that actually throws if you access properties
that have not been defined?
This wouldn't change the way PHP objects work by default, but would
lighten the memory-overhead in a lot of modern frameworks, and
possibly speed up property-access too, since you can have a flat
look-up table for class-properties; and could eliminate the need for
an "object" or "component" base-class in frameworks...
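The error-reporting half of this idea can already be approximated in userland. The sketch below uses hypothetical names (StrictObject, Invoice) and is an assumption about how such a base class might behave; it cannot deliver the memory or speed benefits, which would need support in the engine.

```php
<?php
// Hypothetical base class: any access to an undeclared property throws.
abstract class StrictObject
{
    public function __get($name)
    {
        throw new LogicException(
            sprintf('Undefined property %s::$%s', get_class($this), $name)
        );
    }

    public function __set($name, $value)
    {
        throw new LogicException(
            sprintf('Undefined property %s::$%s', get_class($this), $name)
        );
    }
}

class Invoice extends StrictObject
{
    public $number;
    public $total = 0.0;
}

$invoice = new Invoice();
$invoice->number = 'INV-1'; // declared property: no magic method involved

$caught = null;
try {
    $invoice->totl = 10.0;  // misspelled name is rejected, not silently created
} catch (LogicException $e) {
    $caught = $e->getMessage();
}
echo $caught, "\n";
```

Because the magic methods only run for properties that are not declared, access to declared properties keeps its normal cost.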
Rasmus, isn't your concern about the impact of dynamic property
support on developers who don't actually use it a nonissue in 5.4,
where properties that aren't dynamic are stored as a flat array?
--
Tom Boutell
P'unk Avenue
215 755 1330
punkave.com
window.punkave.com
What about a magic interface instead of a new base class, in a similar
vein to the existing ArrayAccess and Serializable interfaces?
NonDynamic, perhaps?
On Mon, 21 May 2012 20:47:51 +0200, Rasmus Schultz rasmus@mindplay.dk
wrote:
> I just realized something that never occurred to me before - every
> property is actually stored as a hash. This test-script will demonstrate:
> [snip]
The test-script contains no information about the version of PHP you're
using. Starting with PHP 5.4, the properties hash table is only created if
you're storing dynamic properties (i.e. assigning undeclared properties)
or if it is otherwise requested. Otherwise, they're stored in an array.
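One way to observe this from a script is to compare objects that only use declared properties with objects that receive the same values as dynamic properties. This is a sketch under assumptions: the class names and the bytesFor() helper are hypothetical, and the attribute line only matters on PHP 8.2 and later, where it suppresses the deprecation notice for dynamic properties (older versions read it as a comment).

```php
<?php
// Both properties are declared, so no per-object hash table is needed.
class Declared
{
    public $a;
    public $b;
}

#[AllowDynamicProperties]
class Dynamic
{
}

// Creates $n values with $make and returns the growth in memory usage.
function bytesFor($make, $n = 1000)
{
    $keep = array();
    $before = memory_get_usage();
    for ($i = 0; $i < $n; $i++) {
        $keep[] = $make();
    }
    return memory_get_usage() - $before;
}

$declared = function () {
    $o = new Declared;
    $o->a = 'a';
    $o->b = 'b';
    return $o;
};

$dynamic = function () {
    $o = new Dynamic;
    $o->a = 'a'; // undeclared: forces the properties hash table into existence
    $o->b = 'b';
    return $o;
};

$declaredBytes = bytesFor($declared);
$dynamicBytes = bytesFor($dynamic);

printf("declared: %d bytes\n", $declaredBytes);
printf("dynamic:  %d bytes\n", $dynamicBytes);
```

On PHP 5.4 and later the dynamic variant should report the larger figure; the exact gap depends on the version and platform.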
--
Gustavo Lopes
Thanks for clarifying that. Sounds like a huge win.
--
Tom Boutell
P'unk Avenue
215 755 1330
punkave.com
window.punkave.com
I ran this script on 5.3.13, and it reported:
786432 bytes used
On 5.4.3, it reported:
262144 bytes used
That is definitely a significant improvement.
Objects are still a lot bigger than their contents. I don't expect
they would ever shrink to the size of their contents exactly or even
all that close of course.
--
Tom Boutell
P'unk Avenue
215 755 1330
punkave.com
window.punkave.com
Hi!
> 262144 bytes used
> That is definitely a significant improvement.
> Objects are still a lot bigger than their contents. I don't expect
> they would ever shrink to the size of their contents exactly or even
> all that close of course.
Hashtables and zvals have overhead. So if you store a 4-byte value, you have:
  zval wrapper - 16 bytes
  allocation unit - 8 bytes IIRC
  hashtable bucket - 36 bytes
So you are already at 60 bytes per value. That's on 32-bit; on 64-bit,
with pointers and longs being wider, it is probably almost double that.
Then also add the storage for the key itself.
If you need more memory-efficient data storage, something like
SplFixedArray may help.
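For reference, a minimal sketch of SplFixedArray usage (the size and the stored values are arbitrary). It holds a fixed number of integer-indexed slots and has no string keys to store, which is where any saving comes from. How much it saves compared with a plain array depends on the PHP version, so it is worth measuring before relying on it.

```php
<?php
$size = 1000;

// The size is fixed up front; indexes run from 0 to $size - 1.
$fixed = new SplFixedArray($size);
for ($i = 0; $i < $size; $i++) {
    $fixed[$i] = $i * 2;
}

echo $fixed->getSize(), "\n";
echo $fixed[10], "\n";

// Writing past the end throws instead of growing the array.
$outOfRange = false;
try {
    $fixed[$size] = 1;
} catch (RuntimeException $e) {
    $outOfRange = true;
}
var_dump($outOfRange);

// The size can be changed explicitly when needed.
$fixed->setSize($size * 2);
echo $fixed->getSize(), "\n";
```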
Stanislav Malyshev, Software Architect
SugarCRM: http://www.sugarcrm.com/
(408)454-6900 ext. 227