Newsgroups: php.internals
Date: Thu, 23 Jun 2011 11:40:12 +0200
Message-ID:
<20110623114012.Horde.RFhDBxPcj3hOAwn8LIYjSMA@neo.wg.de>
To: internals@lists.php.net
References: <4E02D95F.7030303@garfieldtech.com>
In-Reply-To: <4E02D95F.7030303@garfieldtech.com>
Subject: Re: [PHP-DEV] foreach() for strings
From: jan@horde.org (Jan Schneider)

Quoting Larry Garfield:

> On 06/20/2011 10:25 AM, John Crenshaw wrote:
>> Doing this with an explicit iterator object is a fine idea. The
>> syntax becomes something like:
>> foreach(new TextIterator($s, 'UTF8') as $pos=>$c)
>> {
>> ...
>> }
>>
>> On the other hand, I think that trying to support iteration without
>> using an iterator object to mediate would be a disaster, and I'm
>> opposed to doing something like that because:
>> 1. The code just looks wrong. PHP developers are generally
>> insulated from the char-arrayness of strings. In addition, since
>> PHP isn't typesafe, the code becomes highly ambiguous. Is the code
>> iterating an array, or a string? It is very hard to tell just by
>> looking. It may be convenient to write, but it's certainly not
>> convenient to read or maintain later. On the other hand, with a
>> mediating iterator object, the intent becomes obvious, and the code
>> is highly readable.
>> 2. The odds of iterating any given string are slim at best.
>> Supporting current, key, next, etc. would require the string object
>> internally to get bloated with additional unnecessary data that is
>> almost never used. This bloat isn't a single int either. For
>> optimal performance it would need to consist of no less than two
>> size_t (char position and binary position), and one encoding
>> indicator.
>> 3. Iteration cannot work without knowing which encoding to use for
>> the string. Is it UTF8? UTF16? UTF7? Binary or some single byte
>> encoding? Some other exotic wide encoding?
>> Without an iterator
>> object in the middle, there is no way to specify this encoding.
>> Always treating this as binary would also be a mistake, since this
>> is almost certainly never actually the correct behavior, even
>> though it may often appear to behave correctly with simple inputs.
>> 4. I've had simple mistakes caught numerous times when foreach
>> complains about getting a scalar rather than an array. So far, it
>> has been exactly right every time. Allowing strings to be iterated
>> would, in the name of convenience, increase the probability of
>> stupid mistakes evading detection. Even worse, the code itself
>> would look logically correct until the developer finally realizes
>> that they have a string and not an array. Errors like this are
>> probably far more common in most projects than the need to iterate
>> a string, so making this change hurts debugging in the common case,
>> for the sake of syntactic sugar in the rare case. Not a good trade.
>>
>> John Crenshaw
>> Priacta, Inc.
>
> I would echo John's statements here. foreach() directly iterating a
> string is going to make my life substantially harder. I work in
> array-heavy systems, and "bad first argument for foreach()" is
> already a hard enough error to track down. It means "somewhere,
> somehow, you put a string where you meant to put an array. GLWT."
> Adding automatic string iteration would take away even that error
> message and leave me with no way to figure out why my code is
> randomly misbehaving. Just looking at the code, I would have no way
> of knowing that such a bug lurks within. That's the downside of a
> weakly typed but still typed language.

And is it any different today if that very same string, which is
supposed to be an array, is accessed with the $var[$n] syntax? It's
not: you won't get an error message for that either, and it's the same
amount of work to track down.
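To illustrate, here is a minimal sketch of that situation. The helper name first_item() and the sample values are made up for this example, and the error handler is only there so the warnings can be counted instead of printed:

```php
<?php
// Hypothetical helper, made up for this example: it expects an array
// but never checks what it actually received.
function first_item($items)
{
    return $items[0];
}

// Collect any warnings or notices PHP raises instead of printing them.
$messages = array();
set_error_handler(function ($errno, $errstr) use (&$messages) {
    $messages[] = $errstr;
    return true;
});

$from_array  = first_item(array('alice', 'bob')); // 'alice'
$from_string = first_item('alice');               // 'a', silently
$after_index = count($messages);                  // still 0

// The same mistake in foreach is reported today: one warning is raised
// and the loop body is skipped.
foreach ('alice' as $c) {
}
$after_foreach = count($messages);                // 1

restore_error_handler();
```

The $var[$n] access on the string returns its first character without any message, while foreach on the same value currently raises a warning.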
Granted, making PHP behave the same way in foreach gives you one more
place where such errors have to be tracked down, but making developer
errors easier to track down is not something that should keep PHP from
adding new features.

Jan.

-- 
Do you need professional PHP or Horde consulting?
http://horde.org/consulting/