Newsgroups: php.internals
From: nikita.ppv@gmail.com (Nikita Popov)
Date: Fri, 14 Mar 2014 12:20:02 +0100
To: Lester Caine
Cc: PHP internals
Subject: Re: [PHP-DEV] Re: [php6] Unicode support, options?

On Fri, Mar 14, 2014 at 11:46 AM, Lester Caine wrote:

> Yasuo Ohgaki wrote:
>
>> I've checked libmbfl AUTHORS in ext/mbstring. There are too many.
>> Switching multibyte filter is easier, I'll use ICU for it. Then there is
>> no obstacle building mbstring by default.
>
> Slight aside but relevant re. regular expressions library ... What is
> wrong with the unicode mode of preg? I'd just been using it without even
> thinking after moving over from ereg.

Nothing is wrong with it; PCRE has very good support for UTF-8 (including
character properties and extended grapheme clusters).

Can we just deprecate mb_ereg? It seems totally useless and just confuses
people. If you want to match regular expressions against non-UTF-8 input,
just convert it beforehand (or use a sane encoding right away, you know).

Nikita
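
For illustration, the "convert first, then use preg's unicode mode" approach suggested above could look roughly like the sketch below. The sample strings and the ISO-8859-1 source encoding are assumptions made up for this example and do not come from the thread; iconv() would presumably work just as well for the conversion step.

```php
<?php
// Assumed input: "café naïve" encoded as ISO-8859-1 (hypothetical sample data).
$latin1 = "caf\xE9 na\xEFve";

// Step 1: convert the legacy-encoded input to UTF-8 before matching.
$utf8 = mb_convert_encoding($latin1, 'UTF-8', 'ISO-8859-1');

// Step 2: match with the /u modifier so PCRE treats pattern and subject
// as UTF-8. \p{L} is the Unicode "letter" character property.
preg_match_all('/\p{L}+/u', $utf8, $words);

// \X matches one extended grapheme cluster (a user-perceived character).
// Here: "e" followed by U+0301 COMBINING ACUTE ACCENT, i.e. two code points.
$decomposed = "e\xCC\x81";
$clusters = preg_match_all('/\X/u', $decomposed);
```

In this sketch $words[0] holds the two words as UTF-8 strings, and $clusters counts the decomposed "é" as a single cluster even though it consists of two code points.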