Newsgroups: php.internals
Xref: news.php.net php.internals:53414
To: Lee davis, Robert Eisele
CC: "internals@lists.php.net"
Date: Mon, 20 Jun 2011 11:25:56 -0400
Thread-Topic: [PHP-DEV] foreach() for strings
Subject: RE: [PHP-DEV] foreach() for strings
From: johncrenshaw@priacta.com (John Crenshaw)

> -----Original Message-----
> From: Lee davis [mailto:leedavis81@gmail.com]
> Sent: Monday, June 20, 2011 9:12 AM
> To: Robert Eisele
> Cc: internals@lists.php.net
> Subject: Re: [PHP-DEV] foreach() for strings
>
> I think this would be quite a useful feature, and am in favor of it.
> However, I think caution should be taken when shifting array utilities out
> of their remit and allowing them to manipulate / traverse other data types.
> You may see the floodgates opening for more requests to adapt array functions
> for other uses.
>
> Say for instance..
>
> Could we also use current(), next() and key() for iteration of strings?
>
> $string = 'string';
> while ($char = current($string))
> {
>     echo key($string); // Would output the offset position I assume 0,1,2 etc??
>     echo $char; // outputs each letter of string
>     next($string);
> }
>
> Lee
>
> On Mon, Jun 20, 2011 at 12:27 PM, Robert Eisele wrote:
>
> > foreach() has many functions, looping over arrays, objects and implementing
> > the iterator interface. I think it's also quite intuitive to use foreach()
> > for strings, too.
> >
> > If you want to implement a parser in PHP, you have to go the way with for +
> > strlen + substr() or $x[$i] to address one character of the string. We could
> > overdo the functionality of foreach() by implementing LVAL's, too, in order
> > to access single bits but this is really uncommon, even if the way of
> > thinking could be, that foreach() gives a single attribute of each value,
> > no matter if it's a complex object with the iterator interface or a
> > primitive. What do you think about this one? My point of view is, that
> > foreach() is very useful, which was acknowledged by many ppl via the
> > comments of my article.
> >
> > I think, adding features like this persuades the one or the other PHP user
> > to upgrade to 5.4.
> >
> > Robert
> >

Doing this with an explicit iterator object is a fine idea. The syntax becomes something like:

    foreach (new TextIterator($s, 'UTF8') as $pos => $c) { ... }

On the other hand, I think that trying to support iteration without using an iterator object to mediate would be a disaster, and I'm opposed to doing something like that because:

1. The code just looks wrong. PHP developers are generally insulated from the char-arrayness of strings. In addition, since PHP isn't type-safe, the code becomes highly ambiguous. Is the code iterating an array, or a string? It is very hard to tell just by looking. It may be convenient to write, but it's certainly not convenient to read or maintain later. On the other hand, with a mediating iterator object, the intent becomes obvious, and the code is highly readable.

2. The odds of iterating any given string are slim at best. Supporting current, key, next, etc. would require the string object internally to get bloated with additional data that is almost never used. This bloat isn't a single int either. For optimal performance it would need to consist of no less than two size_t (char position and binary position), and one encoding indicator.

3. Iteration cannot work without knowing which encoding to use for the string. Is it UTF8? UTF16? UTF7? Binary or some single byte encoding? Some other exotic wide encoding? Without an iterator object in the middle, there is no way to specify this encoding. Always treating the string as binary would also be a mistake, since this is almost never the correct behavior, even though it may often appear to behave correctly with simple inputs.

4. I've had simple mistakes caught numerous times when foreach complains about getting a scalar rather than an array. So far, it has been exactly right every time.
Allowing strings to be iterated would, in the name of convenience, increase the probability of stupid mistakes evading detection. Even worse, the code itself would look logically correct until the developer finally realizes that they have a string and not an array. Errors like this are probably far more common in most projects than the need to iterate a string, so making this change hurts debugging in the common case, for the sake of syntactic sugar in the rare case. Not a good trade.

John Crenshaw
Priacta, Inc.
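
For illustration only, here is a minimal userland sketch of the mediating-iterator approach argued for above. TextIterator is the hypothetical class name used in the foreach example in this message, not an existing PHP class. The sketch assumes PHP 7.4 or later with the mbstring extension loaded, and it uses the mbstring encoding name 'UTF-8' rather than the 'UTF8' shorthand.

```php
<?php
// Hypothetical TextIterator: a userland sketch, not part of PHP itself.
// Assumes PHP 7.4+ and the mbstring extension.
final class TextIterator implements Iterator
{
    private string $text;
    private string $encoding;
    private int $length;   // length in characters, not bytes
    private int $pos = 0;  // current character offset

    public function __construct(string $text, string $encoding = 'UTF-8')
    {
        $this->text = $text;
        $this->encoding = $encoding;
        $this->length = mb_strlen($text, $encoding);
    }

    public function rewind(): void
    {
        $this->pos = 0;
    }

    public function valid(): bool
    {
        return $this->pos < $this->length;
    }

    public function key(): int
    {
        return $this->pos;
    }

    public function current(): string
    {
        // Extract exactly one character in the declared encoding.
        return mb_substr($this->text, $this->pos, 1, $this->encoding);
    }

    public function next(): void
    {
        $this->pos++;
    }
}

// "na\u{00EF}ve" is 5 characters but 6 bytes in UTF-8.
$s = "na\u{00EF}ve";
$chars = [];
foreach (new TextIterator($s, 'UTF-8') as $pos => $c) {
    $chars[$pos] = $c;
}
```

The encoding is stated at the call site, which is the argument made in point 3: a byte-wise walk over the same string would split the two-byte character in the middle. This sketch keeps only a character offset, so every current() call rescans the string from the start. An implementation aiming for the performance described in point 2 would also track the byte position.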