Newsgroups: php.internals
Date: Mon, 9 Nov 2015 17:42:26 +0100
From: me@kelunik.com (Niklas Keller)
To: Steven Hilder
Cc: PHP Internals, Leigh, Rowan Collins, Joe Watkins, Dmitry Stogov
Subject: Re: [PHP-DEV] Null bytes in anonymous class names
References: <563B6ED1.1030601@gmail.com>

2015-11-09 16:27 GMT+01:00 Steven Hilder <steven.hilder@sevenpercent.solutions>:

> On Thu, 05 Nov 2015 15:14:47, Leigh wrote:
>
>> On 5 November 2015 at 14:59, Rowan Collins wrote:
>>
>>> PHP uses null bytes quite a lot to produce deliberately illegal
>>> identifiers. For instance the old eval-like create_function() [e.g.
>>> https://3v4l.org/hqHjh] and the serialization of private members [e.g.
>>> https://3v4l.org/R6Y6k]
>>>
>>> In this case, I guess the "@" in "class@anonymous" makes the name illegal
>>> anyway, but I'm not sold on the null byte being more unacceptable here
>>> than anywhere else.
>>>
>>> Regards,
>>>
>>> --
>>> Rowan Collins
>>> [IMSoP]
>>>
>>
>> That doesn't mean it's a good approach (*cough* namespaces *cough*), and
>> these bits of "magic" are supposed to be hidden away from users. I'm
>> guessing in this particular instance, the point of the null is to make
>> string operations cut off after "anonymous", however string operations
>> that respect the zval string length aren't going to do this.
>>
>> e.g.
>> var_dump() the class name is put through sprintf and it cuts off at
>> the null, but get_class or ReflectionClass::getName() just returns the
>> original string, and exposes the implementation details.
>>
>
> Internal names for anonymous classes need to be unique.
>
> The current method of ensuring uniqueness ('\0' + filename + lexer pos) was
> written by Dmitry[1], following from Joe's original implementation[2]. As I
> understand it, this was supposed to be entirely hidden from userland; hence
> the null byte.
>
> So, I prepared a patch for `get_class()` and `ReflectionClass::getName()`,
> which in my view should behave the same way as `var_dump()` etc., but I've
> now realised that ignoring the unique suffix from the class name will break
> functionality that is otherwise desirable:
>
> * Aliasing an anonymous class... (see bug70106[3])
>
>     $instance = new class {};
>     $class_name = get_class($instance);
>     class_alias($class_name, 'MyAlias');
>
> * Creating a new anonymous class instance dynamically...
>
>     $instance = new class {};
>     $class_name = get_class($instance);
>     $new_instance = new $class_name;
>
>   ...although the `get_class()` used here isn't necessary, and you could
>   write `= new $instance;` without ever knowing $instance's class.
>
> Our options seem to be:
>
> (A) Leave everything as it is. This puts us in a situation where anonymous
>     class names are treated inconsistently in userland; a situation that I
>     think could lead to considerable confusion about the "real" names of
>     these classes and how they should be used.
>
> (B) Patch `get_class()` and `ReflectionClass::getName()` (any others?) to
>     strip the suffix (the '\0' and everything after it) from anonymous
>     class names prior to display. This prevents userland code from
>     distinguishing between multiple anonymous classes; breaking
>     functionality as described above. To mitigate this, we could allow
>     `class_alias()` to accept either a class name OR an instance in the
>     first argument.
>     The `new` operator already supports an instance in place of a class
>     name.
>
> (C) Assuming that userland SHOULD have access to the internal names of
>     anonymous classes, remove the null byte so that the full names are
>     displayed everywhere. The drawback of this option is that class names
>     won't be as predictable. It occurs to me that a different naming scheme
>     could help here - wouldn't a monotonically increasing integer suffix be
>     enough? The current scheme seems like an awful waste of memory, too.
>
> Option (B) seems like the best to me, because I can't really see a need to
> know the unique suffix -- provided any use-cases are catered for by an
> alternative (modifying `class_alias()` should be enough).
>
> Thoughts?
>
> Steve
>
> [1] http://git.php.net/?p=php-src.git;a=commitdiff;h=2a1b0b1#patch1
> [2] http://git.php.net/?p=php-src.git;a=commitdiff;h=49608e0#patch12
> [3] https://bugs.php.net/bug.php?id=70106
>

Having the path info is quite useful for debugging purposes.

Regards, Niklas