Newsgroups: php.internals
Xref: news.php.net php.internals:64551
From: inefedor@gmail.com ("Nikita Nefedov")
To: "Nikita Popov"
Cc: "internals@lists.php.net"
Date: Sat, 05 Jan 2013 17:58:19 -0000
Subject: Re: [PHP-DEV] Ruby's symbols

On Sat, 05 Jan 2013 12:21:26 -0000, Nikita Popov wrote:

> On Sat, Jan 5, 2013 at 3:07 PM, Nikita Nefedov wrote:
>> What symbols can give:
>> 1. More convenient way to use it almost everywhere as a replacement for
>> strings and sometimes for constants. There's a lot of code that uses
>> arrays as a parameter-stores. For example, here's how you usually
>> define a form in Symfony2:
>>         $builder
>>             ->add('title', 'text', array(
>>                 'label' => 'Album title'
>>             ))
>>             ->add('title_alias', 'text', array(
>>                 'label' => 'Album alias',
>>                 'required' => false,
>>                 'property_path' => 'properties.titleAlias'
>>             ))
>>             ->add('comment', 'text', array(
>>                 'label' => 'Comment',
>>                 'required' => false,
>>                 'property_path' => 'properties.comment'
>>             ))
>>             ->add('labels', 'text', array(
>>                 'label' => 'Musical labels',
>>                 'required' => false,
>>                 'property_path' => 'properties.labels'
>>             ))
>>             ->add('language', 'text', array(
>>                 'required' => false,
>>                 'property_path' => 'properties.language'
>>             ))
>> It could be improved this way:
>>         $builder
>>             ->add('title', :text, array(
>>                 :label => 'Album title'
>>             ))
>>             ->add('title_alias', :text, array(
>>                 :label => 'Album alias',
>>                 :required => false,
>>                 :property_path => 'properties.titleAlias'
>>             ))
>>             ->add('comment', :text, array(
>>                 :label => 'Comment',
>>                 :required => false,
>>                 :property_path => 'properties.comment'
>>             ))
>>             ->add('labels', :text, array(
>>                 :label => 'Musical labels',
>>                 :required => false,
>>                 :property_path => 'properties.labels'
>>             ))
>>             ->add('language', :text, array(
>>                 :required => false,
>>                 :property_path => 'properties.language'
>>             ))
>> 2. Memory usage reduction. AFAIK, there's a lot of cases like the above
>> and the use of symbols would affect on memory usage significantly.
>> 3. Memory leaks. Symbols can't be just garbage collected as, for
>> example zvals, it's not possible. But we can do it for every request,
>> so I don't think it would be problem.
>> 4. Autocompletion from IDEs.
>
> Hi Nikita!
>
> I don't quite understand what those symbols would actually be good for.
> If it's just for saving exactly one character for array keys (:foo vs
> "foo"), then this isn't worth it. If this is about memory savings, then
> I don't think it will help at all. PHP uses interned strings, so all
> those "label" etc strings in your above example actually use the same
> string value. The hash for those strings is also precomputed, so symbol
> don't have a performance advantage either. Regarding your fourth point,
> autocompletion is available for string array keys at least in PhpStorm
> and I guess also in other IDEs.
>
> So, I don't yet really get what the point behind the symbols is.
>
> Thanks,
> Nikita

Hi,

Yes, you are right about interned strings; I didn't know about that.

My personal opinion is that strings should be used to store data (as values), not to retrieve it (as keys). But strings are more developer-friendly than anything else: you don't need to define a new constant or a new enum member on the receiving side to add a new parameter, and you can always see what the parameter is about. So you can think of Symbols as enum members that don't need to be declared. There's actually no technical reason behind that.

There would be a small speed-up, though: with Symbols, an array's Buckets would hold a numeric key, so retrieving an element would only need to compare two longs instead of calling memcmp.

The current situation looks a little bad. AFAIK, this is what happens when you retrieve a value from an array by a string key:

1. zend_new_interned_string_int is called to intern the string, or to find the already interned copy, and returns a pointer to the stored string:
   1. Hash the string (O(n)).
   2. Retrieve the bucket from arBuckets.
   3. Find the needed bucket by iterating over all retrieved buckets (over *pLast) and comparing their keys with memcmp.
   4. If found, return the pointer to the string; otherwise create a new bucket...
2. Now that we have an interned string, we can try to retrieve the value from the array with it:
   1. Hash the string again (O(n)).
   2. Retrieve the bucket by hash from arBuckets.
   3. Compare the strings with memcmp again.

This could be improved with Symbols: you wouldn't need to hash the string twice or use memcmp.

BTW, do we really need a doubly linked list for interned strings (pListLast, pListNext) and all the extra members of HashTable/Bucket that are only needed for arrays? I know this is off-topic and these structs are used everywhere in PHP (because of DRY), but there are some places, like this one or the class tables, where we don't need the functionality of PHP's arrays.
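To make the comparison between string keys and Symbol keys more concrete, here is a rough sketch in C. It is not the Zend Engine code: the bucket and table types and the functions hash_str, insert, find_str, and find_sym are hypothetical names made up for illustration, the hash is just a DJB-style "times 33" loop, and the symbol ids are assumed to be handed out once somewhere else (for example by the compiler).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TABLE_SIZE 8 /* power of two, so a mask can replace the modulo */

/* Hypothetical bucket for illustration only, NOT the real Zend Bucket. */
typedef struct bucket {
    unsigned long h;     /* hash of the string key, or the symbol id itself */
    const char *key;     /* NULL when the bucket is keyed by a symbol id */
    size_t key_len;
    int value;
    struct bucket *next; /* collision chain within one slot */
} bucket;

typedef struct {
    bucket *slots[TABLE_SIZE];
} table;

/* DJB-style hash: walks every byte, so it is O(n) in the key length. */
unsigned long hash_str(const char *s, size_t len) {
    unsigned long h = 5381;
    for (size_t i = 0; i < len; i++) {
        h = h * 33 + (unsigned char)s[i];
    }
    return h;
}

/* Prepend a new bucket to the chain of the slot selected by h.
 * Buckets are never freed here; this is a sketch, not a full container. */
void insert(table *t, unsigned long h, const char *key, size_t len, int value) {
    assert(t != NULL);
    bucket *b = malloc(sizeof *b);
    if (b == NULL) {
        abort();
    }
    b->h = h;
    b->key = key;
    b->key_len = len;
    b->value = value;
    b->next = t->slots[h & (TABLE_SIZE - 1)];
    t->slots[h & (TABLE_SIZE - 1)] = b;
}

/* String key: hash the key, walk the chain, confirm the match with memcmp. */
int *find_str(table *t, const char *key, size_t len) {
    unsigned long h = hash_str(key, len);
    for (bucket *b = t->slots[h & (TABLE_SIZE - 1)]; b != NULL; b = b->next) {
        if (b->key != NULL && b->h == h && b->key_len == len &&
            memcmp(b->key, key, len) == 0) {
            return &b->value;
        }
    }
    return NULL;
}

/* Symbol key: the id selects the slot directly, and a match is a single
 * integer comparison. No hashing of bytes and no memcmp. */
int *find_sym(table *t, unsigned long id) {
    for (bucket *b = t->slots[id & (TABLE_SIZE - 1)]; b != NULL; b = b->next) {
        if (b->key == NULL && b->h == id) {
            return &b->value;
        }
    }
    return NULL;
}
```

With a zero-initialized table, insert(&t, hash_str("label", 5), "label", 5, 10) followed by find_str(&t, "label", 5) takes the hash-and-memcmp path, while insert(&t, 1, NULL, 0, 20) followed by find_sym(&t, 1) only compares integers. Whether that difference is measurable in a real engine is a separate question that this sketch does not answer.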
