Newsgroups: php.internals
Xref: news.php.net php.internals:71271
Date: Sun, 19 Jan 2014 17:09:40 +0100
To: Yasuo Ohgaki
Cc: Nikita Popov,
    "internals@lists.php.net"
Subject: Re: [PHP-DEV] [RFC] Multibyte char handling
From: pierre.php@gmail.com (Pierre Joye)

On Thu, Jan 16, 2014 at 11:47 PM, Yasuo Ohgaki wrote:
> Hi Nikita,
>
> On Fri, Jan 17, 2014 at 7:38 AM, Nikita Popov wrote:
>
>> No, I don't want a locale-based approach. I want the string functions to
>> stay as is. Multibyte variants of the functions can be added to the
>> multibyte extension.
>
>
> Creating mb_*() function would not solve security issues of
> multibyte char handling since multibyte aware functions are
> optional feature.

We never supported these functions as multibyte-safe, nor claimed that
they are. However, I fully understand that we should solve this problem
one way or another.

> However, it may work if PHP compiles mbstring by default and
> discourage use of addslashes()/var_export()/stripslashes()
> in favor of mb_*() variants.

I do not think we should discourage the use of these functions. Instead,
we should clearly document that the mb_* APIs are the ones to rely on
whenever multibyte support is required.

I agree with the others that we should not add optional arguments to the
existing APIs, for a few reasons:

1. It does not solve anything: people still have to update their code,
   and they won't, unless perhaps they read the docs or the changelog.
2. It is really not a clean solution.
3. We already have many duplicate functions in mb. That has worked well
   so far, and we can add the ones discussed here.

The last question was about relying on locale. This is absolutely not a
solution. Locale has proven to be totally unreliable, buggy and unsafe,
not to mention the total lack of real POSIX locale support on Windows.

For anything related to locale, formats or encoding, we should rely on
intl (ICU) and not on the system's locale. This is the only way to be
portable, safe and up to date.

Cheers,
Pierre
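
For illustration only, here is a minimal sketch of the multibyte-safety
problem this thread is about. It is written in Python rather than PHP,
and both byte_escape() and char_escape() are hypothetical helpers: the
first mimics the byte-wise behaviour of an addslashes()-style function,
the second an encoding-aware variant. Neither is PHP's or mbstring's
actual implementation. The example relies on the Shift_JIS character
"表", whose second byte is 0x5C, the same value as an ASCII backslash.

```python
# Sketch: why byte-wise escaping is not multibyte-safe.
# Both helpers are hypothetical stand-ins, not PHP's implementation.

def byte_escape(data: bytes) -> bytes:
    """Prefix quote and backslash bytes with a backslash, ignoring encoding."""
    out = bytearray()
    for b in data:
        if b in (0x27, 0x22, 0x5C):  # ' " \
            out.append(0x5C)
        out.append(b)
    return bytes(out)


def char_escape(data: bytes, encoding: str) -> bytes:
    """Decode first, escape whole characters, then re-encode."""
    text = data.decode(encoding)
    escaped = "".join("\\" + ch if ch in "'\"\\" else ch for ch in text)
    return escaped.encode(encoding)


raw = "表".encode("shift_jis")
print(raw.hex())                             # 955c
print(byte_escape(raw).hex())                # 955c5c (trail byte was "escaped")
print(char_escape(raw, "shift_jis").hex())   # 955c   (character left intact)
```

The byte-wise version inserts an extra 0x5C after the lead byte, so the
escaped output no longer represents the original string. The
encoding-aware version has to be told the encoding explicitly, which is
the kind of information a separate mb_* variant would carry.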