Newsgroups: php.internals
Xref: news.php.net php.internals:6430
From: moriyoshi@at.wakwak.com (Moriyoshi Koizumi)
To: ilia@prohost.org
Cc: PHP Internals
Date: Sun, 14 Dec 2003 06:46:31 +0900
Subject: Re: [PHP-DEV] Re: Regarding the latest patch on fgetcsv() (stable branch)
In-Reply-To: <200312131619.36182.ilia@prohost.org>
References: <25BBBBC2-2CD2-11D8-8FCC-000A95CE0C62@at.wakwak.com> <200312131555.31407.ilia@prohost.org> <71356864-2DAE-11D8-89FE-000A95CE0C62@at.wakwak.com> <200312131619.36182.ilia@prohost.org>

On 2003/12/14, at 6:19, Ilia Alshanetsky wrote:

> On December 13, 2003 03:53 pm, Moriyoshi Koizumi wrote:
>> Could a quarter be a minority?
>
> Unless the rules of mathematics had changed 25% is still a minority. You also
> forget that there are plenty of people who compile extensions and never end
> up using them.

A similar fact applies to your assumption.
It's also true that PHP can handle multibyte strings without mbstring or iconv
in some cases, where users are simply fortunate enough not to get into trouble,
most likely because they don't happen to use the multibyte characters that are
known to cause problems because of their byte structure. These user questions
[1][2] describe exactly when it goes wrong.

[1] http://marc.theaimsgroup.com/?l=php-dev&m=103828989330521&w=2
[2] http://news.php.net/article.php?group=php.i18n&article=633

> The critical point of this entire discussion is about NOT forcing choices on
> people who do not want/need them. There is no good reason to force multibyte
> version of fgetcsv() on every single user, when there are not one but two PHP
> extensions designed explicitly for multibyte support.

On the other hand, that choice is in practice open only to users who are
already familiar with multibyte handling. First of all, flexibility in
configuration has been causing a lot of confusion. I'd be happy if every
existing application used the mb_*() functions instead of their counterparts
in the appropriate places, but that's unlikely, because we have two versions
of the string manipulation functions. And again, users have been kept from
writing multibyte-safe applications because the multibyte-flavored extensions
are not currently enabled by default, though that is not my main point.

> If fgetcsv() in PHP 5 cannot be designed in such a way as to have no
> significant performance penalties for non-multibyte strings the function
> should be introduced as mb_fgetcsv() or iconv_fgetcsv().

So why not start thinking about how it could be made bearably fast even with
multibyte support enabled? While I think the code I currently have is the most
portable and the fastest, it's likely that far better code is possible.

Moriyoshi
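
As a rough illustration of the "structure" problem mentioned in the message: in
Shift_JIS, some two-byte characters have 0x5C (the ASCII backslash) as their
second byte. The sketch below is in Python rather than C or PHP and is not the
fgetcsv() code discussed in this thread. The helper find_closing_quote() is
hypothetical, and treating the backslash as an escape character is an
assumption made only for the demonstration.

```python
# Hypothetical illustration; not the fgetcsv() code discussed in this thread.

def find_closing_quote(data, start, quote, escape):
    """Return the index of the first unescaped `quote` at or after `start`, or -1.

    Works on bytes (byte-oriented scan) or str (character-oriented scan).
    The element following each `escape` is skipped without being examined.
    """
    i = start
    while i < len(data):
        unit = data[i:i + 1]      # length-1 slice: bytes for bytes, str for str
        if unit == escape:
            i += 2                # skip the escape and whatever follows it
            continue
        if unit == quote:
            return i
        i += 1
    return -1


field = '"表"'                     # a quoted CSV field holding one character
raw = field.encode("shift_jis")   # 表 encodes as 0x95 0x5C; 0x5C is ASCII '\'

# Byte-oriented scan: the trail byte 0x5C is mistaken for an escape character,
# so the real closing quote is skipped.
byte_result = find_closing_quote(raw, 1, b'"', b"\\")

# Character-oriented scan: 表 is a single unit, so the closing quote is found.
char_result = find_closing_quote(field, 1, '"', "\\")

print(raw.hex(), byte_result, char_result)
```

Under these assumptions the byte-oriented scan returns -1 (no closing quote
found), while the character-oriented scan returns 2. The same character encoded
as EUC-JP or UTF-8 contains no 0x5C byte, which is one reason byte-oriented
code can appear to work for some users and fail for others.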