From: robert@xarg.org (Robert Eisele)
To: Todd Ruth
Cc: Stas Malyshev, "internals@lists.php.net"
Date: Mon, 20 Jun 2011 20:38:28 +0200
Subject: Re: [PHP-DEV] Re: foreach() for strings
Newsgroups: php.internals

I really like the ideas shared here.
It's worth considering whether the array functions should also work
with strings. Maybe that would be the way to go, but I'm more excited
about the OOP implementation with TextIterator and ByteIterator, which
solves the whole problem at once (and is easier to implement, as Stas
mentioned). As Jonathan said, database results with a certain encoding
could be iterated, too.

The only way to work around the text/byte problem would be to prefix
EVERY string with 1-2 bytes of "string-type" information, or to add a
type flag to the zval structure. Handling everything with zvals instead
of objects would have the advantage that database layers like mysqlnd
could write the database encoding directly into the zval, so the user
would not need to decide which encoding is used.

A new casting operator (binary) could then cast the string to a 1-byte
array. But this is syntactic sugar over the OOP implementation - I
don't know which one is the better choice. For example:

    $utf8_string = "Jägermeister"; // the UTF-8 information is stored in the zval
    foreach ($utf8_string as $k => $v)         // would iterate in text mode
    foreach ((binary)$utf8_string as $k => $v) // would iterate in byte mode

as opposed to this:

    $utf8_obj = new ByteIterator("Jägermeister");
    foreach ($utf8_obj as $k => $v)
    foreach ($utf8_obj->toText() as $k => $v)

I think the first one is easier and would be nicer to average
developers (and lazy programmers like me ;o) ).

Todd, I like neither str_split() nor text_string_to_array(). Sure,
str_split() could be optimized to return a cheaper result when it is
used inside foreach(), but I would rather use one of the
implementations mentioned above.

2011/6/20 Todd Ruth

> On Mon, 2011-06-20 at 10:06 -0700, Stas Malyshev wrote:
> > Hi!
> >
> > On 6/20/11 9:57 AM, Todd Ruth wrote:
> > > Iterators are nice.  Having a "text_string_to_array" function
> > > would also be fine.  For example:
> > >
> > > $s = 'hello';
> > > foreach (text_string_to_array($s) as $x) {
> > >     var_dump($x);
> > > }
> >
> > text_to_array($s) == str_split($s, 1)
>
> Does that have approximately the same performance as marking the
> string as being OK to use as an array?  For example,
>
> $s = file_get_contents($big_file);
> foreach (str_split($s, 1) as $x) {
>     f($x);
> }
>
> Are there performance issues with the above compared to:
>
> $s = file_get_contents($big_file);
> foreach (text_string_to_array($s) as $x) {
>     f($x);
> }
>
> assuming text_string_to_array could be implemented as marking
> the string OK to use as an array.
>
> Again, I don't know enough about the internals.  I'm just imagining
> a significant difference for very long strings between:
> $a1 = text_to_array('hello');
> and
> $a2 = array('h','e','l','l','o');
>
> $a1 and $a2 could act identically until a set occurred.  For example,
> "$a1['key'] = 5;" would first trigger $a1 becoming just like $a2 so
> that the set could take place.
>
> Any string that has not been hit with text_string_to_array would lead
> to all the usual error messages some of us know and love and
> any string that has been hit with text_string_to_array would allow all
> the fancy features some people are seeking.  I'm trying to find a
> way to please the people that want strings to act like arrays without
> ruining the day for those of us who are glad strings don't act like
> arrays.
>
> - Todd
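
To make the byte-mode versus text-mode distinction in this message concrete, here is a rough sketch using only functions that already exist in PHP. The helper names iterate_bytes() and iterate_text() are hypothetical; they are not the proposed ByteIterator/TextIterator classes, and the generator syntax needs PHP 5.5 or later, which postdates this thread. The input is assumed to be UTF-8.

```php
<?php
// Hypothetical helpers for illustration only; neither exists in PHP
// and neither is the TextIterator/ByteIterator proposed in this thread.

// Byte mode: yield one byte at a time, without building an array first.
function iterate_bytes($s)
{
    $len = strlen($s);
    for ($i = 0; $i < $len; $i++) {
        yield $i => $s[$i];
    }
}

// Text mode: yield one UTF-8 character (code point) at a time.
// Assumes $s is UTF-8; preg_split() returns false on invalid input.
function iterate_text($s)
{
    $chars = preg_split('//u', $s, -1, PREG_SPLIT_NO_EMPTY);
    if ($chars === false) {
        throw new InvalidArgumentException('string is not valid UTF-8');
    }
    foreach ($chars as $i => $c) {
        yield $i => $c;
    }
}

// "Jägermeister" written with explicit UTF-8 bytes for the "ä".
$s = "J\xC3\xA4germeister";

$bytes = iterator_to_array(iterate_bytes($s)); // 13 one-byte strings
$chars = iterator_to_array(iterate_text($s));  // 12 characters
```

Note that only iterate_bytes() is lazy here: iterate_text() still lets preg_split() build the full array up front, so it does not address the performance question about very long strings raised in the quoted message.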