From: Ondrej Ivanič (ondrej@kmit.sk)
To: internals
Date: Tue, 16 Aug 2005 10:16:22 +0200
Subject: Re: [PHP-DEV] PHP Unicode support design document
Message-ID: <4301A0D6.6000205@kmit.sk>
In-Reply-To: <937066F0-AA5F-41E2-99A0-D74C7F44FFCA@gravitonic.com>

Andrei Zmievski wrote:
> + Determining length of Unicode strings via strlen() function, some
> simple string functions ported (substr).

Determining the kind of a character is not a problem in single-byte
character sets, but with Unicode and its various encoding schemes I don't
see an easy way to do it. It would be nice to have functions like
isNumber(char), isAlphabetic(char), isWhitespace(char), ...

Is this planned or not?

-- 
Ondrej Ivanic
(ondrej@kmit.sk)
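
For illustration only, here is a minimal sketch of what the requested predicates might do. It is written in Python because its standard unicodedata module exposes Unicode general categories. The function names mirror the ones proposed in the message. They are hypothetical and are not part of any PHP API or of the design document. The sketch assumes the text has already been decoded to code points, so the classification does not depend on the encoding scheme.

```python
import unicodedata


def is_number(ch: str) -> bool:
    """True if the single code point has a numeric general category (Nd, Nl, No)."""
    return unicodedata.category(ch).startswith("N")


def is_alphabetic(ch: str) -> bool:
    """True if the single code point is a letter (general category L*).

    Assumption: this approximates, but is not identical to, the Unicode
    'Alphabetic' derived property.
    """
    return unicodedata.category(ch).startswith("L")


def is_whitespace(ch: str) -> bool:
    """True if the single code point is treated as whitespace by str.isspace()."""
    return ch.isspace()


if __name__ == "__main__":
    # U+0663 ARABIC-INDIC DIGIT THREE, U+010D LATIN SMALL LETTER C WITH CARON,
    # U+3000 IDEOGRAPHIC SPACE
    for ch in ("\u0663", "\u010d", "\u3000", "a", "5"):
        print(repr(ch), is_number(ch), is_alphabetic(ch), is_whitespace(ch))
```

Each function expects exactly one code point. unicodedata.category() raises TypeError for longer strings.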