Newsgroups: php.internals
From: antony@zend.com (Antony Dovgal)
Date: Sat, 19 May 2007 18:26:18 +0400
To: Tomas Kuliavas
CC: internals@lists.php.net
Subject: Re: [PHP-DEV] PHP Unicode extension in PHP6
Message-ID: <464F090A.9090200@zend.com>
In-Reply-To: <40865.88.118.163.159.1179583186.squirrel@avilys.eik.lt>

On 19.05.2007 17:59, Tomas Kuliavas wrote:
> SquirrelMail scripts are designed to work with binary strings. They will
> have to deal with emails written in many different character sets. In some
> cases scripts must know string length in bytes and not in symbols. If PHP
> starts converting email body or message parts, strings won't match
> information stored in email headers.

Try this, you'll see it's really easy:

"; var_dump(strlen(($s))); var_dump(strlen((binary)$s)); ?>

> If unicode.semantics are turned on, PHP6-dev breaks one-time pad creation
> and randomizing of mt_rand. crc32, base64_encode and fputs emit notices and
> warnings: "function expects parameter 1 to be strictly a binary string,
> Unicode string given", "x character unicode buffer downcoded for binary
> stream runtime_encoding". I might provide sample code, if I find a way
> to reduce existing code to something simple. Currently I am trying to
> understand what exactly is broken in SquirrelMail functions.

I don't think anything is broken. You are just passing Unicode strings to
functions that expect only binary ones.

> I can fix these issues, but one day PHP might add similar checks to
> str_replace(), array functions and pcre extension.
> Then it will break character set conversion functions and any other code
> that operates with 8bit strings. Currently I am stuck on broken
> authentication and can't check if other parts of interface are already
> broken.
>
> I could not find a way to disable unicode.semantics in the script.

Sure, that's not possible.

> PHP_INI_PERDIR is not an option for scripts that are designed to be
> portable. In some cases end user can't use .htaccess and can't control
> php.ini or httpd.conf. mbstring function overloading effects can be
> disabled. The way to turn off unicode.semantics is not documented. If
> mbstring.func_overload is turned on, I can't trust string functions. Same
> thing happens when unicode.semantics are turned on.

-- 
Wbr,
Antony Dovgal
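
The difference between string length in bytes and in symbols, which this thread turns on, can be sketched in released PHP versions, where strlen() always counts bytes. This is only a minimal sketch, assuming UTF-8 input: the helper name utf8_char_count() is hypothetical (it is not part of PHP or SquirrelMail), and it does not reproduce the PHP6-dev unicode.semantics behaviour discussed above.

```php
<?php
// Hypothetical helper (not a PHP built-in): counts UTF-8 characters by
// counting every byte that is NOT a continuation byte (bit pattern 10xxxxxx).
// Assumes $s holds well-formed UTF-8.
function utf8_char_count($s)
{
    $count = 0;
    $bytes = strlen($s);                 // strlen() returns the byte length
    for ($i = 0; $i < $bytes; $i++) {
        if ((ord($s[$i]) & 0xC0) !== 0x80) {
            $count++;                    // lead byte or plain ASCII byte
        }
    }
    return $count;
}

// Sample string chosen for illustration: two 2-byte characters
// followed by five ASCII letters, written as raw UTF-8 bytes.
$s = "\xC4\x85\xC5\xBE\x75\x6F\x6C\x61\x73";

var_dump(strlen($s));          // int(9): length in bytes
var_dump(utf8_char_count($s)); // int(7): length in characters
```

A mail client that compares a message part against a size recorded in the headers needs the first number; code that truncates or pads text for display needs the second.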