Newsgroups: php.internals
Message-ID: <5306677A.7070000@gmail.com>
Date: Thu, 20 Feb 2014 21:37:14 +0100
From: keisial@gmail.com (Ángel González)
To: Pierre Joye
CC: Andrea Faulds, PHP internals
Subject: Re: [PHP-DEV] Simpler Unicode solution for PHP6

On 20/02/14 17:23, Pierre Joye wrote:
> Hi,
>
> On Feb 20, 2014 11:17 PM, "Andrea Faulds" wrote:
>> Hi,
>>
>> As a simpler to implement approach to Unicode, could we perhaps support
>> it just by adding an “is UTF-8” flag to strings internally? Then unmodified
>> functions would just see a normal string and handle it like they do any
>> other, and modified and new Unicode-aware functions would test for the
>> presence of the flag and handle the string appropriately in that case.
>>
>> Thoughts?
>
> That could be an option during the development phase. However I do not like
> the idea of a flag for the final implementation, it creates more troubles
> from an application point of view.

Optimizing space? The flag could be embedded in the type itself. Create a
macro that checks whether a value is of type IS_STRING_RAW or IS_STRING_UTF8,
and use it in place of the hundreds of IS_STRING comparisons (that would
touch many lines, but it would not be hard).
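
A minimal sketch of what that could look like, not taken from the PHP source. IS_STRING_RAW and IS_STRING_UTF8 are the names proposed above; everything else (the mini_zval struct, the MINI_* macros, the helper functions, and the numeric tag values) is hypothetical and only there to make the example compile on its own.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical type tags; the numeric values are illustrative only. */
enum {
    IS_NULL        = 0,
    IS_LONG        = 1,
    IS_STRING_RAW  = 6,  /* byte string, encoding unknown */
    IS_STRING_UTF8 = 7   /* byte string known to hold UTF-8 */
};

/* Simplified stand-in for a zval: just a type tag and a string payload. */
typedef struct {
    unsigned char type;
    const char   *str;
    size_t        len;
} mini_zval;

/* One macro replaces the scattered "type == IS_STRING" comparisons.
 * Note: the argument is evaluated twice, so pass a plain pointer. */
#define MINI_IS_STRING(zv) \
    ((zv)->type == IS_STRING_RAW || (zv)->type == IS_STRING_UTF8)

/* Unicode-aware code additionally asks whether the UTF-8 tag is set. */
#define MINI_IS_UTF8(zv) ((zv)->type == IS_STRING_UTF8)

/* Build a string value; the "flag" lives in the type tag, not a new field. */
static mini_zval mini_string(const char *s, int is_utf8)
{
    assert(s != NULL);
    mini_zval zv = { is_utf8 ? IS_STRING_UTF8 : IS_STRING_RAW, s, strlen(s) };
    return zv;
}

/* "Unmodified" function: sees any string as plain bytes. */
static size_t mini_strlen_bytes(const mini_zval *zv)
{
    return MINI_IS_STRING(zv) ? zv->len : 0;
}

/* "Unicode-aware" function: counts code points when the tag says UTF-8,
 * and falls back to the byte length for raw strings. */
static size_t mini_strlen_chars(const mini_zval *zv)
{
    if (!MINI_IS_STRING(zv)) {
        return 0;
    }
    if (!MINI_IS_UTF8(zv)) {
        return zv->len;
    }
    size_t n = 0;
    for (size_t i = 0; i < zv->len; i++) {
        /* Skip UTF-8 continuation bytes (bit pattern 10xxxxxx). */
        if (((unsigned char)zv->str[i] & 0xC0) != 0x80) {
            n++;
        }
    }
    return n;
}
```

With this layout no extra storage is needed per string, and a call such as mini_strlen_chars(&v) gives a different answer from mini_strlen_bytes(&v) only when v was created with the UTF-8 tag.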