Newsgroups: php.internals
Message-ID: <2b9a524b62303d2b7c3f5e20a7b86537@gravitonic.com>
From: andrei@gravitonic.com (Andrei Zmievski)
To: PHP Internals
Cc: Dmitry Stogov
Date: Wed, 6 Sep 2006 09:39:57 -0700
Subject: RFC: unicode.semantics: runtime or not?

We really need to settle on whether we want unicode.semantics to be changeable at runtime or not. During early development it was ZEND_INI_PERDIR, meaning that it could be changed in .htaccess and VirtualHost blocks.
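
To make the difference between the two permission levels concrete, here is a minimal sketch in C. It is a simplified model, not the engine's code: the SKETCH_* names, the sketch_ini_entry struct and sketch_alter() are all hypothetical, and the bit layout is only modelled on the ZEND_INI_* flags.

```c
#include <stdbool.h>

/* Hypothetical permission bits, modelled on the ZEND_INI_* flags.
   These are illustration-only names, not the engine's definitions. */
enum {
    SKETCH_INI_USER   = 1 << 0, /* changed by the script at runtime */
    SKETCH_INI_PERDIR = 1 << 1, /* changed in .htaccess / VirtualHost blocks */
    SKETCH_INI_SYSTEM = 1 << 2  /* changed only in server-wide configuration */
};

/* Hypothetical, stripped-down INI entry holding a boolean setting. */
typedef struct {
    const char *name;
    int modifiable; /* mask of stages that are allowed to change the value */
    bool value;
} sketch_ini_entry;

/* Apply new_value only if the requesting stage is in the entry's mask.
   Returns true when the value was changed, false when it was refused. */
bool sketch_alter(sketch_ini_entry *entry, int stage, bool new_value)
{
    if (!(entry->modifiable & stage)) {
        return false; /* this stage may not touch the setting */
    }
    entry->value = new_value;
    return true;
}
```

In this model, an entry declared with only SKETCH_INI_SYSTEM refuses a change requested at the SKETCH_INI_PERDIR stage, so every directory served by that process sees the same mode. An entry whose mask also includes SKETCH_INI_PERDIR accepts the change, which is the flexibility in question.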

However, the infrastructure to support this flexibility was deemed too complicated at the Paris PDM. Basically, we need to maintain two sets of symbol tables and convert between them on the fly, as well as keep two copies of each class entry. The latter was especially problematic: instead of just using the class entry pointer, you had to access it like U_CLASS_ENTRY(ce). So it was decided that the unicode.semantics switch would be ZEND_INI_SYSTEM only, and that is how development has proceeded since then.

However, concerns have come up that keeping it this way will make PHP 6 adoption infeasible for the majority of hosting companies and users, since they would have to run two copies of Apache to support both modes. We can go back to the PERDIR version, but that requires a lot of work, not just in the engine but also in a lot of extensions. I will let Dmitry provide the technical details, but we need to decide which way to go:

1. ZEND_INI_SYSTEM, and make people run two copies of Apache if they want both modes. This is architecturally simpler and more robust, I believe.

2. ZEND_INI_PERDIR, and let people switch modes as described above. This is a lot of work and will probably result in quite a few edge cases in places where we used to rely on the mode being stable (such as APC or serialization, for example).

-Andrei