Newsgroups: php.internals
Xref: news.php.net php.internals:88636
Date: Fri, 2 Oct 2015 09:18:06 +0100
From: petercowburn@gmail.com (Peter Cowburn)
To: PHP internals
Subject: Strings, invalid escape sequences and parse errors

Happy Friday, internals!

Prior to PHP 7, any "invalid" escape sequences within strings (as far as I can see) were ignored and the characters treated literally. For example:

"\xGG" ("broken" hex sequence) gives "\xGG",
"\99" ("broken" octal sequence) gives "\99",
"\m" (not a recognised sequence at all) gives "\m"

and so on.

PHP 7 introduced a new escape sequence for Unicode codepoints, "\u{...}". This deliberately breaks away from the pack and raises a Parse Error when an escape sequence starting with "\u{" is not followed by the characters required to make it a "valid" escape sequence (i.e. 1 to 6 hex characters followed by a closing curly brace).

Why does \u{} behave differently from every other escape sequence? Because the author prefers it that way, and indeed thinks all "invalid" escape sequences should result in the same error. [pers. comm.]

The question I'd like to bring forward is: can we either:

a) change all other "invalid" escape sequences to be a parse error [that would mean "\m" would raise a parse error!]
b) change \u{} to behave like any other escape sequence, by not raising a parse error and instead keeping the literal characters
or
c) tell me to keep quiet and accept the oddball behaviour; having quirks is The PHP Way after all.

Either way, I'd like to see some resolution to this sooner rather than later, as we're very late in the PHP 7.0.0 game.

Cheers, and enjoy your weekends,
Peter
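
For reference, the behaviour described above can be reproduced with a short script. This is only a minimal sketch, assuming PHP 7.0 or later (it relies on ParseError being catchable); probe_literal() is a hypothetical helper made up for illustration, not anything that exists in PHP itself.

```php
<?php
// Hypothetical helper (not part of PHP): takes the source text of a
// double-quoted string literal, compiles it with eval() and reports
// either the resulting string or the parse error it triggered.
function probe_literal(string $source): string
{
    try {
        // eval() compiles the literal at run time, so an invalid "\u{"
        // sequence surfaces as a catchable ParseError instead of
        // aborting the whole script at compile time.
        $value = eval("return $source;");
        return 'ok: ' . $value;
    } catch (ParseError $e) {
        return 'parse error: ' . $e->getMessage();
    }
}

// Single-quoted strings keep the backslashes intact, so eval() sees
// exactly the escape sequences written here.
$literals = ['"\xGG"', '"\99"', '"\m"', '"\u{1F600}"', '"\u{zz}"'];

foreach ($literals as $literal) {
    echo $literal, ' => ', probe_literal($literal), PHP_EOL;
}
```

The first three literals come back with their characters unchanged, "\u{1F600}" comes back as the encoded codepoint, and "\u{zz}" is reported as a parse error, which is the inconsistency in question.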