From: pollita@php.net (Sara Golemon)
Date: Mon, 24 Nov 2014 14:21:37 -0800
To: Andrea Faulds
Cc: PHP Internals
Subject: Re: [PHP-DEV] [RFC] Unicode Escape Syntax

On Mon, Nov 24, 2014 at 2:09 PM, Andrea Faulds wrote:
> Here’s a new RFC: https://wiki.php.net/rfc/unicode_escape
>

I'm okay with producing UTF-8 even though our strings are technically
binary. As you state, UTF-8 is the de facto encoding, and recognizing
this is pretty reasonable.

You may want to make it a requirement that strings containing \u
escapes are denoted as:

u"blah blah"

We set aside this format back in the PHP6 days (note that b"blah" is
equivalent to "blah" for binary strings).

On the BMP versus SMP issue of \uXXXX styles, we addressed this in
PHP6 by making \u denote four-hexit BMP codepoints, while \U denoted
six-hexit codepoints, e.g.

"\u1234" === "\U001234"

I'd rather follow this style than make \u special and different from
hex and octal notations by using braces.

-Sara
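
The \u versus \U convention described above can be illustrated with a small userland sketch. This is not the PHP6 lexer code and not the RFC's implementation; it is a rough approximation that assumes \u takes exactly four hex digits and \U exactly six, and that the result is emitted as UTF-8. The function names codepoint_to_utf8() and expand_unicode_escapes() are hypothetical, and surrogate code points are not rejected here.

```php
<?php
// Hypothetical helper: encode one Unicode code point as UTF-8 bytes.
function codepoint_to_utf8(int $cp): string
{
    if ($cp < 0x80) {
        return chr($cp);
    }
    if ($cp < 0x800) {
        return chr(0xC0 | ($cp >> 6)) . chr(0x80 | ($cp & 0x3F));
    }
    if ($cp < 0x10000) {
        return chr(0xE0 | ($cp >> 12))
             . chr(0x80 | (($cp >> 6) & 0x3F))
             . chr(0x80 | ($cp & 0x3F));
    }
    return chr(0xF0 | ($cp >> 18))
         . chr(0x80 | (($cp >> 12) & 0x3F))
         . chr(0x80 | (($cp >> 6) & 0x3F))
         . chr(0x80 | ($cp & 0x3F));
}

// Hypothetical helper: expand \uXXXX (four hexits) and \UXXXXXX (six hexits)
// found in a raw string. The match is case-sensitive, so \u and \U stay distinct.
function expand_unicode_escapes(string $raw): string
{
    return preg_replace_callback(
        '/\\\\U([0-9A-Fa-f]{6})|\\\\u([0-9A-Fa-f]{4})/',
        function (array $m): string {
            // Group 1 is filled for \U; it is an empty string when \u matched.
            $hex = $m[1] !== '' ? $m[1] : $m[2];
            $cp  = (int) hexdec($hex);
            // Assumption: escapes beyond U+10FFFF are left untouched.
            return $cp > 0x10FFFF ? $m[0] : codepoint_to_utf8($cp);
        },
        $raw
    );
}

// Single quotes keep the backslashes literal so the helper sees the raw escapes.
var_dump(expand_unicode_escapes('\u1234') === expand_unicode_escapes('\U001234'));
```

Under these assumptions both spellings of U+1234 expand to the same three UTF-8 bytes, while a supplementary-plane code point such as U+1F600 can only be written in the six-hexit form ('\U01F600').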