Newsgroups: php.internals
Date: Mon, 24 Nov 2014 23:36:28 +0000
Message-ID: <7D59D9FF-F8ED-4015-A34A-144A9367A589@ajf.me>
In-Reply-To: <20141124232911.GB6315@phcomp.co.uk>
Cc: internals@lists.php.net
References:
<20141124232911.GB6315@phcomp.co.uk>
To: Alain Williams
Subject: Re: [PHP-DEV] [RFC] Unicode Escape Syntax
From: ajf@ajf.me (Andrea Faulds)

> On 24 Nov 2014, at 23:29, Alain Williams wrote:
>
> There is a big difference with \u or \U and \x or \o and that is the number of
> characters that follow the escape. \x has 2, \o has 3 - both are short and easy
> to count with the eye. \U012345 is quite long and it is not so visually obvious
> where it should end.
>
> Ergo: I prefer Andrea's "\u{0123}" as it is going to be more robust against typos.

Typos are an angle I hadn’t quite considered, but yes, this syntax is more robust against that. Importantly, it’s a compile error if you produce a broken literal, while if you screwed up the brace-free style you’d probably just get a mangled string.

> One other thing that we could do is to allow code points to be named, with \U
> (capital 'U') eg:
>
> echo "\U{arabic letter alef}\n";

Ooh, that’s an interesting idea. I believe Perl actually has this already, although it uses the \N syntax:

http://perldoc.perl.org/perlreref.html#ESCAPE-SEQUENCES

Is something like that what you have in mind?

> If you think that it is a bad idea, please update the RFC to say why this is a
> bad idea and so why it is not going to happen - for now.
>
> It would be nice since a code point is just a big number without any really obvious
> meaning, but a name makes for greater clarity.
>
> However: I suspect that interpreting this might be considerably slower which
> means slower compilation.

I’ll add it to the Future Scope part.

One issue with this, however, is that we’d have to include a Unicode info database from somewhere with the names of the characters. That’d probably mean requiring ICU or something like it, which the current patch doesn’t do.

-- 
Andrea Faulds
http://ajf.me/
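
For illustration only, here is a minimal sketch of how named code points behave in a language that already has them. It is written in Python rather than PHP, is not part of the RFC or its patch, and the helper name char_from_name is hypothetical. It assumes only Python's standard unicodedata module, which bundles the Unicode name table.

```python
import unicodedata


def char_from_name(name: str) -> str:
    """Resolve a Unicode character name to the character itself.

    Hypothetical helper for illustration; it relies on the Unicode
    name table bundled with Python's standard library.
    """
    # Official character names are uppercase, so normalise first.
    return unicodedata.lookup(name.upper())


alef = char_from_name("arabic letter alef")

# The named escape and the numeric escape denote the same code point.
assert alef == "\N{ARABIC LETTER ALEF}" == "\u0627"

# An unknown name is rejected outright instead of yielding a mangled string.
try:
    char_from_name("no such letter")
except KeyError:
    unknown_rejected = True
else:
    unknown_rejected = False
```

The lookup succeeds only because the runtime ships a name database, which is the same dependency question raised above for PHP (ICU or something like it).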