Newsgroups: php.internals
Date: Sat, 19 May 2007 16:59:46 +0300 (EEST)
To: "Antony Dovgal"
Cc: internals@lists.php.net
Subject: Re: [PHP-DEV] PHP Unicode extension in PHP6
From: tokul@users.sourceforge.net ("Tomas Kuliavas")

>> Hi,
>>
>> Could you make unicode.semantics configurable at PHP_INI_ALL level?
>
> No.
>
>> Or maybe PHP6 has string functions that are not unicode aware?
>
> All string functions are supposed to be able to work with both Unicode and
> binary strings.
> Unicode is just an addition, it doesn't mean that binary strings are not
> supported anymore, even in Unicode mode.
> But I don't really understand the reason for your question, care to
> provide more details?

SquirrelMail scripts are designed to work with binary strings. They have to deal with emails written in many different character sets. In some cases scripts must know the string length in bytes and not in symbols. If PHP starts converting the email body or message parts, the strings won't match the information stored in the email headers.

If unicode.semantics is turned on, PHP6-dev breaks one-time pad creation and the randomizing of mt_rand. crc32, base64_encode and fputs emit notices and warnings:

"function expects parameter 1 to be strictly a binary string, Unicode string given".
"x character unicode buffer downcoded for binary stream runtime_encoding".

I might provide sample code if I find a way to reduce the existing code to something simple. Currently I am trying to understand what exactly is broken in the SquirrelMail functions.

I can fix these issues, but one day PHP might add similar checks to str_replace(), the array functions and the pcre extension. Then it will break the character set conversion functions and any other code that operates on 8bit strings. Currently I am stuck on broken authentication and can't check whether other parts of the interface are already broken.

I could not find a way to disable unicode.semantics in the script. PHP_INI_PERDIR is not an option for scripts that are designed to be portable.
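To make the PHP_INI_PERDIR versus PHP_INI_ALL point concrete, here is a minimal model of how the changeability flags combine. It is written in Python purely as an illustration and is not PHP or SquirrelMail code. The constant values are the ones PHP documents for its configuration modes; the helper `can_change` and the assumption that unicode.semantics is PERDIR-only (as this thread implies) are hypothetical.

```python
# Changeability flags as documented for PHP configuration directives.
PHP_INI_USER = 1      # settable from a script, e.g. with ini_set()
PHP_INI_PERDIR = 2    # settable in php.ini, .htaccess or httpd.conf
PHP_INI_SYSTEM = 4    # settable in php.ini or httpd.conf
PHP_INI_ALL = PHP_INI_USER | PHP_INI_PERDIR | PHP_INI_SYSTEM  # settable anywhere


def can_change(directive_mode: int, stage: int) -> bool:
    """Hypothetical helper: a change is accepted only if the directive's
    mode includes the bit of the stage that attempts the change."""
    return bool(directive_mode & stage)


# Assumption taken from this thread: unicode.semantics is PERDIR-only.
UNICODE_SEMANTICS_MODE = PHP_INI_PERDIR

# A portable script can only act at the USER stage.
script_ok = can_change(UNICODE_SEMANTICS_MODE, PHP_INI_USER)
# A per-directory config file acts at the PERDIR stage.
htaccess_ok = can_change(UNICODE_SEMANTICS_MODE, PHP_INI_PERDIR)
# If the directive were PHP_INI_ALL, the script-level change would pass.
script_ok_if_all = can_change(PHP_INI_ALL, PHP_INI_USER)
```

Under these assumptions the script-level change is rejected and only the per-directory one is accepted, which is why a PHP_INI_ALL setting is what a portable script would need.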
In some cases the end user can't use .htaccess and can't control php.ini or httpd.conf.

The effects of mbstring function overloading can be disabled. The way to turn off unicode.semantics is not documented. If mbstring.func_overload is turned on, I can't trust string functions. The same thing happens when unicode.semantics is turned on.

-- 
Tomas