Newsgroups: php.internals
Message-ID: <60526.78.61.224.253.1209928511.nsm@avilys.eik.lt>
Date: Sun, 4 May 2008 22:15:11 +0300 (EEST)
To: internals@lists.php.net
Subject: Re: [PHP-DEV] Removal of unicode_semantics
From: 
tokul@users.sourceforge.net ("Tomas Kuliavas")

>> > We've discussed this a few times in the past and it's time to make a
>> > final decision about its removal.
>> >
>> > I think most people have agreed that this is the way forward but no
>> > one has produced a patch. I have a student working on unicode
>> > conversion for the Google Summer of Code and this would help make it
>> > simpler.
>>
>> unicode_semantics=on breaks backwards compatibility in scripts that have
>> implemented multiple character set support in current PHP setups.
>
> Why don't you go ahead and make a list of those exact issues then? We
> can then see how to fix those issues. That's much more useful than just
> posting to the mailing list when you don't agree with something. From
> what I've seen with my code base, the changes that I have to do are
> minimal once some (internal) functions are fixed up.

If I remain silent, others will argue that "everybody agrees on removal
of unicode_semantics".

I write and maintain charset decoding and encoding functions.
unicode_semantics breaks every mapping table and every other function
that operates on binary 8-bit strings.

In the slides by Andrei Zmievski, Unicode symbols are written with \u.
Why are they written with \x (hex) and \ (octal) in current PHP6?

---