Newsgroups: php.internals
Xref: news.php.net php.internals:31121
Date: Fri, 20 Jul 2007 10:11:06 +1000
Subject: RE: [PHP-DEV] POSIX regex
From: scott.mcnaught@synergy8.com

I don't really know much about unicode, and to be honest, I don't really
know much about the internal workings of php. But I assume that there are
going to be different implementations of string functions depending on
whether the string is unicode or not.

I'm going to suggest an implementation... Keep in mind I haven't hacked
around with php source, so my variable naming etc will be wrong... and it's
all pseudocode, so it's not...

    // The object type used when php creates a string
    class ZendString
    {
        char *strPtr;                     // however strings are stored in php
        ZendStringFunctions *pFunctions;  // implementations for this string's type
    };

    abstract class ZendStringFunctions
    {
        abstract function strtolower(ZendString *pStr);
        abstract function strtoupper(ZendString *pStr);
        abstract function substr(ZendString *pStr);
        // All functions that differ depending on unicode / non-unicode
        // implementation
        // ...
    };

    // A set of string functions for unicode strings
    class ZendStringFunctionsUnicode extends ZendStringFunctions
    {
        function strtolower(ZendString *pStr)
        {
            // unicode implementation
        }
        function strtoupper(ZendString *pStr)
        {
            // unicode implementation
        }
        function substr(ZendString *pStr)
        {
            // unicode implementation
        }
    };

    // A set of string functions for non-unicode strings
    class ZendStringFunctionsNonUnicode extends ZendStringFunctions
    {
        function strtolower(ZendString *pStr)
        {
            // non-unicode implementation
        }
        function strtoupper(ZendString *pStr)
        {
            // non-unicode implementation
        }
        function substr(ZendString *pStr)
        {
            // non-unicode implementation
        }
    };

    // the strtolower implementation
    ZEND_FUNC strtolower(ZendString *pStr)
    {
        return pStr->pFunctions->strtolower(pStr);
    }

    // the strtoupper implementation
    ZEND_FUNC strtoupper(ZendString *pStr)
    {
        return pStr->pFunctions->strtoupper(pStr);
    }

    ZEND_FUNC unicode_val(ZendString *pStr)
    {
        // do something with pStr->strPtr
        delete pStr->pFunctions;
        pStr->pFunctions = new ZendStringFunctionsUnicode();
    }

Anyway - the point I'm trying to make is to use function pointers to switch
between implementations. You could even make the ZendStringFunctions
singletons and just set pStr->pFunctions to an instance of the singleton
(in which case unicode_val would not delete the old one).

I think this would provide a very fast way of doing what you are trying to
do: each call is a single indirect call, with no type check inside the
function.

I'm just making a suggestion, and feel free to ignore/criticise me if I'm
wrong. I don't know anything about PHP's internals...

Just an idea

Scott

-----Original Message-----
From: Andrei Zmievski [mailto:andrei@gravitonic.com]
Sent: Friday, 20 July 2007 9:36 AM
To: scott.mcnaught@synergy8.com
Cc: internals@lists.php.net
Subject: Re: [PHP-DEV] POSIX regex

On Jul 19, 2007, at 4:14 PM, wrote:

> I don't like the idea of having a "u" prefix for Unicode strings. It may
> improve performance, and give you some level of fine grain control, but...
>
> - It breaks your "keep php simple" policy by introducing a lot of new
>   functions (ugly).
> - I (plus a lot of others) have an existing php5 application which I wish
>   to eventually use with Unicode, and like others, I don't want to spend
>   time refactoring.
> - It will also introduce bugs when programmers accidentally forget to add
>   the "u" prefix when working with unicode.
>
> If you always want to produce Unicode, I think it's best to always use a
> cast or a conversion function.
>
> Eg
>
> $str = (unicode)(strtoupper($str));
> Or
> $str = unicode_val(strtoupper($str));

Good idea and it will totally work, except that it won't. strtoupper()
operates in different ways according to the type of the string that it gets.

-Andrei

-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
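
For anyone who wants to poke at the idea in real code, here is a minimal C sketch of the function-pointer dispatch suggested in the message above, using shared static tables in place of the singletons. Every name in it (zstr, zstr_ops, zstr_to_upper, zstr_unicode_val) is hypothetical and nothing is taken from the PHP source. The "unicode" upper-casing is only a toy stand-in that handles ASCII and the UTF-8 encoded letters U+00E0 to U+00FE, and zstr_unicode_val only swaps the table pointer, where a real conversion would presumably also re-encode the bytes.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical names throughout: nothing here is taken from the PHP source. */
struct zstr;

/* One table per string type; each string points at the table for its type. */
struct zstr_ops {
    const char *type_name;
    void (*to_upper)(struct zstr *s);
};

struct zstr {
    char buf[64];
    const struct zstr_ops *ops;
};

/* Byte-wise upper-casing: in the default "C" locale only ASCII a-z change. */
static void binary_to_upper(struct zstr *s)
{
    for (char *p = s->buf; *p != '\0'; p++)
        *p = (char)toupper((unsigned char)*p);
}

/* Toy stand-in for Unicode case mapping: ASCII plus the UTF-8 encoded
 * letters U+00E0..U+00FE (skipping U+00F7, the division sign). */
static void unicode_to_upper(struct zstr *s)
{
    for (unsigned char *p = (unsigned char *)s->buf; *p != '\0'; p++) {
        if (*p == 0xC3 && p[1] >= 0xA0 && p[1] <= 0xBE && p[1] != 0xB7) {
            p[1] -= 0x20;   /* e.g. C3 A9 (e-acute) becomes C3 89 (E-acute) */
            p++;            /* step over the continuation byte */
        } else if (*p < 0x80) {
            *p = (unsigned char)toupper(*p);
        }
    }
}

/* Shared tables, playing the role of the singletons. */
static const struct zstr_ops binary_ops  = { "binary",  binary_to_upper };
static const struct zstr_ops unicode_ops = { "unicode", unicode_to_upper };

/* Strings start out as non-unicode in this sketch. */
void zstr_init(struct zstr *s, const char *bytes)
{
    assert(strlen(bytes) < sizeof s->buf);
    strcpy(s->buf, bytes);
    s->ops = &binary_ops;
}

/* Public entry point: one indirect call, no type check. */
void zstr_to_upper(struct zstr *s)
{
    s->ops->to_upper(s);
}

/* Analogue of unicode_val(): only swaps the table pointer. */
void zstr_unicode_val(struct zstr *s)
{
    s->ops = &unicode_ops;
}
```

The sketch also makes the objection in the quoted reply concrete. If zstr_to_upper() runs while the string still points at binary_ops, the two bytes of a UTF-8 "é" are left alone, and calling zstr_unicode_val() afterwards does not redo the work. Converting first and upper-casing second goes through unicode_ops instead. So the order of the two calls matters, which is the problem with unicode_val(strtoupper($str)).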