Newsgroups: php.internals
Xref: news.php.net php.internals:102649
From: nicolas.grekas+php@gmail.com (Nicolas Grekas)
To: Nikita Popov
Cc: PHP internals
Date: Sun, 8 Jul 2018 20:36:42 +0200
Subject: Re: [PHP-DEV] Introspection for references

>>> Before talking about solutions, can the people who need this first outline
>>> what functionality is needed and what it is needed for (and maybe what
>>> workarounds you currently use). E.g. do you only need to know whether
>>> something is a reference, or do you need to know whether two somethings
>>> are part of the same reference, etc. There are probably multiple use cases
>>> for this with different needs.
>>>
>>
>> We're using reference introspection to do both: we need to know when a
>> zval is a reference, and we also need to track each of them separately.
>>
>> The use case is being able to introspect any arbitrary PHP data structure,
>> with one main application: providing an enhanced "dump()" function.
>>
>> See e.g.
>> this screenshot for what we get using the dump() function
>> provided by Symfony VarDumper component:
>> https://symfony.com/doc/current/_images/07-hard-ref.png
>>
>> In PHP5 days, Julien Pauli wrote a PHP extension to do zval introspection.
>> Here is the code + README (see test case 001.phpt for an example with
>> references):
>> https://github.com/symfony/symfony/tree/3.4/src/Symfony/Component/Debug/Resources/ext
>>
>> With PHP7, using pure PHP introspection is easier to maintain and still
>> very fast, so we deprecated the extension.
>> Here is the code doing reference introspection:
>> https://github.com/symfony/symfony/blob/master/src/Symfony/Component/VarDumper/Cloner/VarCloner.php#L83
>>
>> It might not be easy to follow, but the basic blocks are:
>>
>>     $array2 = $array1;
>>     $array2[$key] = $unique_cookie;
>>     if ($array1[$key] === $unique_cookie) => we found a reference
>>
>> Then we also maintain a registry of $unique_cookie so that we know if we
>> already saw that reference or not (the check is done before the above "if"
>> of course).
>>
>
> Thanks for the explanation. I think that the VarCloner use case needs two
> bits of functionality:
>
> 1. Detecting whether a variable is a reference, so you can handle this
>    specially.
> 2. An efficient way of determining whether a variable is part of a
>    reference that has already been seen (and which one).
>
> The second requirement is stronger than just the ability to detect whether
> two variables are part of the same reference. Given just a same_ref($v1,
> $v2) function, one would have to check against a list of all previously
> seen references one at a time, rather than only performing a hashtable
> lookup.
>
> Currently this functionality is implemented as:
>
> 1. Copying the array, assigning a cookie to the copy and seeing if the
>    original array is modified. With an extra catch for TypeErrors, this is
>    compatible with typed properties.
> 2. Replacing the reference with a Stub object, which can be looked up by
>    object id. At the end the Stub objects are replaced with their values
>    again. This is fundamentally incompatible with typed properties, as the
>    type will likely not permit the Stub class.
>
> Here are my thoughts on possible APIs for this use case.
>
> Construction of reference-reflection objects
> -----
>
> An issue already discussed in the other threads is that in PHP we need to
> specify whether a parameter is accepted by reference, by value or by
> preferred-reference. We don't have the possibility of accepting either a
> value or reference, whatever we get. This leaves us with a few options:
>
> 1. Introducing a VM-level primitive that is not subject to this
>    limitation. The typed properties thread suggested a reflect_variable()
>    language construct. I'm not too fond of this option because reference
>    reflection seems like an awfully specific thing to introduce a new
>    language construct for.
>
> 2. A ReflectionReference::fromVariable(&$var) constructor. Contrary to
>    what was said in the other thread, this does not cause issues with the
>    copy-on-write mechanism. Since PHP 7, references and non-references can
>    share values (including immutablized values in SHM). However, this
>    approach does have two issues:
>    a) It is impossible to distinguish whether $var was a singleton
>       reference or a value beforehand. Both will show up as rc=2 references
>       inside ReflectionReference::fromVariable(). (This may also be an
>       advantage, because from a language-design perspective, we treat
>       singleton references as non-references.)
>    b) In case the original $var was a value, it will now be a reference,
>       so this has a side-effect.
>
> 3. A ReflectionReference::fromArrayElem(array $array, string|int $key)
>    constructor, as suggested by Nicolas. This avoids the reference/value
>    problem and solves the specific VarCloner case efficiently and directly.
>    On the other hand, introspection of references inside non-arrays
>    requires some workarounds (e.g., casting objects to arrays).
>
> 4. A combination of these. For example we could have...
>    ... ReflectionReference::fromArrayElem(array $array, string|int $key)
>        for array items.
>    ... ReflectionReference::fromObjectProp(object $object, string $key)
>        for object properties.
>    ... ReflectionReference::fromVariable(&$var) for any other special
>        cases.
>    This would allow covering the common and interesting cases with
>    specialized methods, and leave a less efficient fallback for the general
>    case. This is probably the option I'd favor.
>

Either 3. or 4. would be good to me. IMHO 3. is enough, because:

- ReflectionReference::fromObjectProp() => there needs to be a way to deal
  with visibility, e.g. accessing private properties. Maybe get the
  ReflectionReference from a ReflectionProperty instead? Or is the array
  cast just enough?
- ReflectionReference::fromVariable() => honestly, I don't see any use for
  local scope introspection. And if there is one, getting the scope as an
  array first is always possible, so you might prefer less complexity here
  (i.e., not support this constructor).

> Determining whether something is a reference
> -----
>
> I think the best way to handle this (and the reason why I used named
> constructors above) is to return null if the value is not a reference. This
> should be the most common case and it would be best to avoid the overhead
> of constructing an unnecessary object in this case.
>
> One important question in this context would be whether we consider
> singleton references as references or not. If we do, then the
> ReflectionReference::fromVariable() constructor will always return a
> non-null value, as the variable will be turned into a singleton reference
> if it was not already a reference. If we consider them as references, we'll
> also want an API method to distinguish them. E.g. a specialized
> isSingleton() or more generally getNumUsers() == 1.
>
> The alternative would be to always construct a ReflectionReference object
> which may or may not be a reference and has an isReference() method. I
> don't see any advantages to that approach though.
>

If we're looking for a benefit, it would come from turning this into a
ReflectionZval instead of a ReflectionReference. There would be isRef, but
also getType, etc. But honestly I'm with you on this, only refs make sense.

> Reference equality
> -----
>
> A couple of approaches:
>
> 1. Have an isEqual(ReflectionReference $other): bool method, which
>    determines whether two references are the same. The disadvantage is that
>    this only allows pair-wise comparisons, so it does not fully solve the
>    VarCloner use-case.
>
> 2. Make the ReflectionReference constructor uniquing. That is, if a
>    ReflectionReference for a certain reference already exists, then the
>    constructor will return the same object. This means that references can
>    be compared by identity $ref1 === $ref2. It also means that they can be
>    used in hashtables via spl_object_id(). (Caveat: It's important to keep
>    the ReflectionReference object alive for the duration in which
>    spl_object_id() is used, as usual.)
>
> 3. Some variation on 2 via a separate API. That is, don't unique
>    ReflectionReferences themselves, but provide a separate getId() API. The
>    returned ID would only be meaningful as long as at least one
>    ReflectionReference object for the reference is live, otherwise it may
>    be reused.
>

2. would work, as I could use that with an SplObjectStorage (which would
satisfy your condition of keeping the object around).
There is a 4th possibility: have a "ReflectionReference::getHash" method
that would return a unique string/id per reference (what
symfony_zval_debug() does).

> Actual API
> -----
>
> If we go with null return value on non-reference and uniquing, then most
> of the functionality is already provided by the constructor. The only
> useful API method I can think of is something like getNumUsers().
>

Maybe some way to introspect the types bound to a reference? I don't have
any use case yet, so I'm not sure it would be useful to build anything.

Nicolas
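
For readers who want to see the cookie technique discussed above in runnable form, here is a minimal sketch. It is not the VarCloner code: the Marker class and the mapReferenceGroups() function are hypothetical names made up for this illustration, and the sketch assumes the inspected array never contains Marker instances of its own. It copies the array once, writes a marker into each slot of the copy, and checks whether the marker shows up in the original; slots that share one reference end up in the same group because the marker written through the first slot is already visible in the others.

```php
<?php

// Hypothetical marker ("cookie") class used only by this sketch.
final class Marker
{
    public $group;     // group number assigned to the reference
    public $original;  // value held by the reference before it was marked

    public function __construct(int $group, $original)
    {
        $this->group = $group;
        $this->original = $original;
    }
}

/**
 * Returns [key => group number] for every slot of $array that is a
 * reference; slots bound to the same reference get the same group number.
 */
function mapReferenceGroups(array $array): array
{
    $copy = $array;  // separated on first write; reference slots stay shared
    $groups = [];    // key => group number (reference slots only)
    $markers = [];   // key of first sighting => Marker, used to restore values

    foreach (array_keys($array) as $key) {
        $value = $array[$key];

        if ($value instanceof Marker) {
            // An earlier iteration already wrote a marker through this
            // reference, so this slot belongs to a known group.
            $groups[$key] = $value->group;
            continue;
        }

        $marker = new Marker(count($markers), $value);
        $copy[$key] = $marker;  // write into the copy only

        if ($array[$key] === $marker) {
            // The write is visible in the original: the slot is a reference.
            $groups[$key] = $marker->group;
            $markers[$key] = $marker;
        }
    }

    // Write the original values back through the shared references.
    foreach ($markers as $key => $marker) {
        $copy[$key] = $marker->original;
    }

    return $groups;
}

$x = 'x';
$y = 'y';
$data = ['a' => 1, 'b' => &$x, 'c' => &$x, 'd' => &$y];

var_dump(mapReferenceGroups($data));
```

Like the approach described in the thread, this temporarily writes an object through each reference, so with typed references the assignment may raise a TypeError that the caller would have to catch.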