Newsgroups: php.internals
Date: Fri, 2 Oct 2015 10:04:58 -0700
From: pollita@php.net (Sara Golemon)
To: Bishop Bettini
Cc: Peter Cowburn, PHP internals
Subject: Re: [PHP-DEV] Strings, invalid escape sequences and parse errors

On Fri, Oct 2, 2015 at 6:53 AM, Bishop Bettini wrote:
> On Fri, Oct 2, 2015 at 4:18 AM, Peter Cowburn wrote:
>
>> a) change all other "invalid" escape sequences to be a parse error [that
>> would mean "\m" would raise a parse error!]
>>
>> b) change \u{} to behave like any other escape sequence, by not raising a
>> parse error and instead keeping the literal characters
>>
>> or c) tell me to keep quiet and accept the oddball behaviour, having quirks
>> is The PHP Way after all.
>>
>
> Well, I think option (a) would break parsed strings containing regex:
>
Oh holy hell. I was about to point towards A, because I agree with Andrea that our invalid escape handling makes no sense, and then you throw this wrench in the gears.

I still think that ignoring invalid sequences is bad and a recipe for disaster. For example, in a given regex string, some "escapes" are passed to the engine as-is, while others like \t\v\f\r\n do get interpolated, which is so inconsistent and entirely PHP it's practically its own meme. But I have to be practical about the fact that there is a TON of existing regex out there (and no small amount of "\u1234" sequences in JSON blobs). A ton of that existing regex is also needlessly using double-quoted strings where single-quoted strings would have worked, meaning we can't just bifurcate on that (even though allowing invalid sequences through in single-quoted strings makes some sense).

Ugh... No, that's too big of a change to existing scripts. Can't do option A, much as I'd like.
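
To make the inconsistency concrete, here is a minimal sketch of how a PHP 7.0+ build should treat each kind of sequence mentioned above. The variable names and sample strings are made up for illustration and are not taken from the thread; the malformed \u{...} case is wrapped in eval() only so that the parse error can be caught at runtime instead of aborting the whole script.

```php
<?php
// Unrecognised escapes are kept literally: the backslash survives,
// so "\m" is two bytes and "\d+" reaches PCRE untouched.
$unknownEscape = "\m";
$regexSource   = "/\d+/";
$regexMatched  = preg_match($regexSource, "abc123");

// Recognised escapes are interpreted by PHP before PCRE ever sees them.
$tab = "\t";

// "\u" without an opening brace is also left alone, which is what
// JSON blobs embedded in double-quoted strings rely on.
$jsonStyle = "\u1234";

// A well-formed \u{...} is converted to its UTF-8 encoding.
$codepoint = "\u{41}";

// A malformed \u{...} is the odd one out: it is a parse error
// rather than being passed through literally.
$malformedIsParseError = false;
try {
    // Single quotes keep the inner string unparsed until eval() runs.
    eval('return "\u{xyz}";');
} catch (ParseError $e) {
    $malformedIsParseError = true;
}

var_dump(strlen($unknownEscape), $regexMatched, strlen($tab));
var_dump($jsonStyle === '\u1234', $codepoint, $malformedIsParseError);
```

Option A would turn the first group into parse errors too, which is exactly what breaks the double-quoted regex case.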

> Option (b) sounds reasonable, but there's probably A Solid Reason it was
> implemented that way
>
AIUI, the "solid reason" was that it's dangerous to fail silently where you have high confidence that something is wrong. Again, I believe in it, but the arguments against option A illustrate why it might not be practical. I hate to say this, but in the interest of consistency (were 7.0 not in its final stage) I'd vote for B.

> which if so leaves (c.ii): accepting the odd-ball behavior....
>
Given that 7.0 is in its final stage, changing this behaviour is probably a non-starter at this point. C seems the most sane^W pragmatic. It's not the first inconsistency PHP's picked up, and it won't be the last.

-Sara