Newsgroups: php.internals
Date: Tue, 20 Jan 2009 10:28:03 +0100
From: webmaster@colder.ch ("Etienne Kneuss")
To: "Guilherme Blanco"
Cc: "Marcus Boerger", "Lars Strojny", "internals Mailing List"
Subject: Re: [PHP-DEV] New function proposal: spl_object_id

Hello,

We already had that discussion in private, but here is an on-list summary:

On Mon, Jan 19, 2009 at 5:39 PM, Guilherme Blanco wrote:
> Ok,
>
> We'll use this method inside Doctrine ORM version 2.0, scheduled to be
> released on September 1st, 2009.
>
> One main location where we are already using it is during the hydration
> process: the process of grabbing a DB tuple and converting it into an
> object graph. Here is the usage.
>
> Each Object of the graph is a Value Object
> (http://en.wikipedia.org/wiki/Value_object). So it does not have any
> other mapping than the to-be-persisted ones. No internal method
> implementation is needed. All Active Record like actions are
> controlled by EntityManager.
>
> Based on that, we have a ClassMetadata that is cached based on class
> name (currently based on spl_object_id, but it's too resource-
> expensive and I'll change that). When we get the DB tuple, we need to
> find the exact ClassMetadata of that item and apply the specific
> DB/PHP type castings for example. Also there's a property attribution.
> Property attribution is thanks to the new Reflection API. We store the
> ReflectionProperty of each field and assign it when we have its
> definition.
>
> Another location where we rely on spl_object_id is inside UnitOfWork
> (http://martinfowler.com/eaaCatalog/unitOfWork.html). We generate a
> mapping of each Entity/Collection to be persisted/updated/deleted. We
> define the order of application of these things based first on the
> generated OID (spl_object_id return) and later by Topological Sorting
> (http://en.wikipedia.org/wiki/Topological_sorting). Finally, we start
> the transaction and the statements.
>
> The point is that we may be doing a huge hydration with lots
> of related objects. We may be dealing with a webpage that fetches
> more than 5000 records with even more associations. All of that
> at runtime. So I have to say performance is something VERY important
> for us.
>
> Why will we not use SplObjectStorage?
> Because it'll be used in different places and should share the same
> OID. Including a couple of these components is not a viable idea since
> it'll lead to a more memory-expensive solution, which we're trying to
> optimize a lot, and it will also force us to include another get call
> (through a method call), which will fall into an even slower
> implementation.
>
> Here are two files in which we have been using spl_object_id (changed
> now to spl_object_hash, since the idea is to update it with Marcus'
> suggestions):
> Object Driver for Hydration:
> http://trac.doctrine-project.org/browser/trunk/lib/Doctrine/ORM/Internal/Hydration/ObjectDriver.php
> UnitOfWork for Persistence:
> http://trac.doctrine-project.org/browser/trunk/lib/Doctrine/ORM/UnitOfWork.php
>
> Short version: Because we want a fast, easy way to associate
> information (temporarily) with an object. Most of the time we use the
> object id/hash as a key in an array. Basically, spl_object_hash is
> fine, it would just be nice if it could be improved in speed.
>
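A minimal sketch of the pattern described in the quoted short version (object hash used as an array key to attach temporary data), for readers who want to see it concretely. The Article class, the $metadata array and the 'state' key are made up for illustration and are not taken from Doctrine:

```php
<?php
// Hypothetical names: Article, $metadata and 'state' exist only in this sketch.
class Article {}

$a = new Article;
$b = new Article;

// Side table keyed by object hash; the objects themselves stay untouched.
$metadata = array();
$metadata[spl_object_hash($a)] = array('state' => 'new');
$metadata[spl_object_hash($b)] = array('state' => 'managed');

// Nested writes work directly because $metadata is a plain array.
$metadata[spl_object_hash($a)]['state'] = 'dirty';

echo $metadata[spl_object_hash($a)]['state'], "\n"; // dirty
echo $metadata[spl_object_hash($b)]['state'], "\n"; // managed
```

Note that the array holds no reference to the objects, so an entry can outlive the object it describes.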
All those use cases boil down to an [object => data] map, which can be
handled by SplObjectStorage:

    $storage = new SplObjectStorage;
    $storage[$obj1] = $data;
    ...
    var_dump($storage[$obj1]);
    ...

There were three concerns:

1) Speed: the main argument for spl_object_id is speed.
=> SplObjectStorage is faster than an array keyed by spl_object_hash
(and can be made even faster).

2) $storage[$obj1]['index'] = 2;
This is sadly a limitation of ArrayAccess.
=> It can be solved either by doing get+change+set, or by using an
ArrayObject instead of an array.

3) Memory: since the object itself is referenced in the storage, you
have to remove it from every map in order for the GC to do its work.
=> This is actually a safeguard: an object's hash is only unique as
long as the object exists:

    $a = new StdClass;
    $h1 = spl_object_hash($a);
    unset($a);
    $b = new StdClass;
    $h2 = spl_object_hash($b);
    var_dump($h1 === $h2); // bool(true)

Conclusion: If you clean up your objects without properly taking care of
the metadata stored in the array indexed by object_id, you'll get
unexpected results anyway.

So far it looks like SplObjectStorage is fine for those use cases. If
somebody has a practical use case (with code) in which SplObjectStorage
can't be sanely used and where spl_object_id is the only solution,
please shoot.

>
> It'll take me some time to dig into the PHP source to try to implement
> it. I'm not a C developer and it has been more than 4 years since I
> touched a single line of C code. Also I can read PHP source, but I'm
> not able to create it.
> I already spoke with Felipe, who will help me with questions about
> the source, but I cannot guarantee I'll be able to do the job.
>
>
> Regards,
>
> On Wed, Dec 17, 2008 at 7:19 PM, Marcus Boerger wrote:
>> Hello Etienne,
>>
>> Wednesday, December 17, 2008, 7:59:01 PM, you wrote:
>>
>>> Hello,
>>
>>> On Wed, Dec 17, 2008 at 7:29 PM, Lars Strojny wrote:
>>>> Hi Guilherme,
>>>>
>>>> thanks for moving the discussion to the list.
>>>>
>>>> On Wednesday, 17.12.2008, at 15:31 -0200, Guilherme Blanco wrote:
>>>> [...]
>>>>> It seems that Marcus controls the commit access to SPL. So I'm turning
>>>>> the conversation async, since I cannot find him online at IRC.
>>>>> So, can anyone review the patch, comment it and commit if approved?
>>>>
>>>> Just for clarification, it is not about access, but about maintenance.
>>>> So if Marcus gives his go, we can happily apply the patch and add a few
>>>> tests (something you could start preparing now).
>>>>
>>>> cu, Lars
>>>>
>>
>>> Last time I checked with Marcus, there were concerns about disclosing
>>> a valid pointer to the user.
>>> I'd be happy to see a use-case where this information is really needed
>>> heavily. The only real use-case of heavy usage seems to be to
>>> implement sets of objects, but SplObjectStorage is here for that
>>> precise use-case...
>>
>> Correct in all, Etienne. The patch might be a tiny bit faster but exposes
>> valid pointers which is extremely bad and also allows other bad things.
>> That was the only reason I used md5 hashing. What I needed was something
>> that is really unique per object (object pointer or id plus pointer to
>> handler table). Since spl_object_hash() does not say how it creates the
>> hash it should be fine to change the way it does it. Since in a new session
>> the hashes are of no more use we can even do that in any new version.
>> However I must still insist on not exposing any valid information.
>>
>> Last but not least. In your code you know the maximum length of the
>> expression, so you can allocate the string and snprintf into it. Even
>> faster is to do a hexdump into a preallocated string. For the size use:
>> char* hash = (char*)safe_emalloc(sizeof(void*), 2, 1);
>> Now the dump of the two pointers.
>> This approach should make it a bit faster for you. Something that might
>> work is to create a random 128 bit hash key that is xored onto the hash
>> created from the two pointers.
>> This hash key can be allocated for each
>> session the first time the function will be used. If you do that I am more
>> than happy to accept that as a replacement for current spl_object_hash().
>>
>> marcus
>>
>>> Regards
>>
>>
>>> --
>>> Etienne Kneuss
>>> http://www.colder.ch
>>
>>> Men never do evil so completely and cheerfully as
>>> when they do it from a religious conviction.
>>> -- Pascal
>>
>>
>> Best regards,
>> Marcus
>>
>>
>
>
> --
> Guilherme Blanco - Web Developer
> CBC - Certified Bindows Consultant
> Cell Phone: +55 (16) 9215-8480
> MSN: guilhermeblanco@hotmail.com
> URL: http://blog.bisna.com
> São Paulo - SP/Brazil
>

Regards,

--
Etienne Kneuss
http://www.colder.ch

Men never do evil so completely and cheerfully as
when they do it from a religious conviction.
-- Pascal
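For reference, a minimal sketch of the two workarounds named under concern 2) above (get+change+set, or storing an ArrayObject), plus the explicit cleanup that concern 3) calls for. The variable names and the 'index' key are only illustrative, and this is one possible way to write it rather than a recommended implementation:

```php
<?php
// Hypothetical names: $entity and the 'index' key exist only in this sketch.
$entity  = new stdClass;
$storage = new SplObjectStorage;

// Option 1: get + change + set.
$storage[$entity] = array('index' => 1);
$data = $storage[$entity];   // get (the array is copied)
$data['index'] = 2;          // change the copy
$storage[$entity] = $data;   // set it back

echo $storage[$entity]['index'], "\n"; // 2

// Option 2: store an ArrayObject, which is passed around by handle.
$storage[$entity] = new ArrayObject(array('index' => 1));
$storage[$entity]['index'] = 2;   // modifies the stored ArrayObject in place

echo $storage[$entity]['index'], "\n"; // 2

// Detach when done, so the storage no longer keeps the object alive.
unset($storage[$entity]);
echo count($storage), "\n"; // 0
```

Option 2 avoids the extra copy and the second write, at the cost of one more object per entry.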