Newsgroups: php.internals
From: yohgaki@ohgaki.net (Yasuo Ohgaki)
Date: Sat, 15 Mar 2014 07:11:20 +0900
Subject: Re: [PHP-DEV] Re: [php6] Unicode support, options?
To: Alexey Zakhlestin
Cc: Nikita Popov, Lester Caine, PHP Internals List

Hi all,

On Fri, Mar 14, 2014 at 8:33 PM, Alexey Zakhlestin wrote:

> > Nothing is wrong with it, PCRE has very good support for UTF-8 (including
> > character properties and extended grapheme clusters). Can we just deprecate
> > mb_ereg? It seems totally useless and just confuses people. If you want to
> > match regular expressions on non-UTF-8 just do a conversion beforehand (or
> > use a sane encoding right away, you know).
>
> Several years ago mb_ereg was slightly faster than pcre. It could have
> changed since then

Unnecessary conversion is better avoided in any case, but we should also
consider the case where the encoding is somehow broken. Conversion then has
to either fail or replace the broken bytes, and replacing them changes the
original data.

Regards,

--
Yasuo Ohgaki
yohgaki@ohgaki.net
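
The point about broken encodings can be illustrated with a minimal sketch. It is written in Python rather than PHP, and it uses Python's built-in UTF-8 decoder only as a stand-in for whatever conversion step would run before matching. The sample bytes and the function names convert_strict and convert_lossy are hypothetical. It shows the two outcomes described above: a strict conversion refuses the input, while a lossy one succeeds but no longer round-trips to the original bytes.

```python
# Hypothetical input: plain ASCII with one stray byte (0xFF), which can
# never occur in well-formed UTF-8.
original = b"abc\xffxyz"


def convert_strict(data: bytes) -> str:
    """Convert, refusing any input that is not well-formed UTF-8."""
    return data.decode("utf-8", errors="strict")


def convert_lossy(data: bytes) -> str:
    """Convert, substituting U+FFFD for each undecodable byte."""
    return data.decode("utf-8", errors="replace")


# Outcome 1: the strict conversion fails on the broken byte.
try:
    convert_strict(original)
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# Outcome 2: the lossy conversion succeeds, but the data has changed.
lossy = convert_lossy(original)
round_trip = lossy.encode("utf-8")

print("strict conversion failed:", strict_failed)
print("lossy result:", repr(lossy))
print("original bytes preserved:", round_trip == original)
```

Either way, the broken byte cannot pass through the conversion untouched, so the code that later matches against the converted string never sees the data as it actually arrived.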