Newsgroups: php.internals
Date: Tue, 5 Jan 2016 09:55:24 -0500
Reply-To: kohler@seas.harvard.edu
To: internals@lists.php.net
Subject: Proposed change in json_encode+JSON_UNESCAPED_UNICODE behavior
From: ekohler@gmail.com (Eddie Kohler)

Hi,

I'm proposing a small change in the behavior of `json_encode(str, JSON_UNESCAPED_UNICODE)` around the issue of line terminators.

The U+2028 LINE SEPARATOR and U+2029 PARAGRAPH SEPARATOR characters are allowed unescaped in JSON strings, but *not* allowed unescaped in JavaScript string literals. This is widely considered a minor wart in the JSON specification.

As a result, the JSON_UNESCAPED_UNICODE flag is dangerous to use when generating HTML that embeds JSON in a script. For example, this will generate a JavaScript error ("Unexpected token ILLEGAL") in the user's browser:

```
// Build a UTF-8 string containing a raw U+2028 LINE SEPARATOR.
$x = mb_convert_encoding('&#x2028;', 'UTF-8', 'HTML-ENTITIES');
// The raw U+2028 lands unescaped inside a JavaScript string literal.
echo '<script>x = ', json_encode($x, JSON_UNESCAPED_UNICODE), ';</script>';
```

The proposal is for `json_encode(..., JSON_UNESCAPED_UNICODE)` to escape the U+2028 and U+2029 characters as \u2028 and \u2029. A new flag, JSON_UNESCAPED_LINE_TERMINATORS, preserves the former behavior.

It's important to note that this change *only* affects the non-default JSON_UNESCAPED_UNICODE flag. Without that flag, json_encode already escapes all non-ASCII characters, including these two.

Jakub Zelenka approves of this change, which we've discussed on GitHub, but since it is a small change in behavior, he asked me to email internals in case anyone objects.

Thanks all,
Eddie Kohler
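
For illustration, the proposed escaping can be approximated in userland by post-processing the encoder's output. This is only a minimal sketch, not the patch under discussion: the helper name `json_encode_js_safe` is hypothetical and not part of PHP. It relies on the fact that `json_encode` output is ASCII outside of string contents, so the UTF-8 byte sequences for U+2028 and U+2029 can only occur inside JSON strings.

```php
<?php
// Hypothetical helper (not a PHP built-in): encode with
// JSON_UNESCAPED_UNICODE, then escape the two line terminators
// that JavaScript string literals reject.
function json_encode_js_safe($value, $flags = 0)
{
    $json = json_encode($value, $flags | JSON_UNESCAPED_UNICODE);
    if ($json === false) {
        return false; // propagate encoding failure unchanged
    }
    // "\xE2\x80\xA8" is U+2028 in UTF-8; "\xE2\x80\xA9" is U+2029.
    // Single-quoted replacements are the literal six-character
    // JSON escapes \u2028 and \u2029.
    return str_replace(
        array("\xE2\x80\xA8", "\xE2\x80\xA9"),
        array('\u2028', '\u2029'),
        $json
    );
}
```

Other non-ASCII characters are still emitted unescaped, which is the point of JSON_UNESCAPED_UNICODE. If the encoder has already escaped the two characters, the replacement finds nothing and the output is unchanged.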