Newsgroups: php.internals Path: news.php.net Xref: news.php.net php.internals:124487 X-Original-To: internals@lists.php.net Delivered-To: internals@lists.php.net Received: from php-smtp4.php.net (php-smtp4.php.net [45.112.84.5]) by qa.php.net (Postfix) with ESMTPS id E08691A00B7 for ; Thu, 18 Jul 2024 17:49:02 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=php.net; s=mail; t=1721325033; bh=JEIpyYKpOch+HfTlLtNHkXG+IYIBiRWo3AOs5Ncjapw=; h=From:Subject:Date:References:Cc:In-Reply-To:To:From; b=AtItkmcFMqeCHCxxkJ/LxgcrMUrTK46Zk078+iiRP7DMeylIr8EwQ48bvcK7/5nOW BqeKrmNd/Pa6su8XjhKbfYb2LTyafjNTuW+LNJSYeCo+BYB+Fa83+cXnh20Vn+uDmP U5Er2X0BbrKfwT3CaMz+6Hc+gileJUA5sQeqxkIdVDdv8YabUCPi+p/OlUgB3ueCYV JULKW09wRVUlwYtBxlygDOYaB5GTG9U/CJLitIvtZNNSzUTAtvVJejjdHfugkstt/E mPA7jH64IATK+Ulk03sE16Pgk1SmJnVLVJGnKsH9sALKVKZ7n9Dit2okws34UvIjY9 7vVi78lKVJaUw== Received: from php-smtp4.php.net (localhost [127.0.0.1]) by php-smtp4.php.net (Postfix) with ESMTP id B43D4180052 for ; Thu, 18 Jul 2024 17:50:32 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 4.0.0 (2022-12-13) on php-smtp4.php.net X-Spam-Level: X-Spam-Status: No, score=0.6 required=5.0 tests=BAYES_50,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,DMARC_PASS,FREEMAIL_FROM, HTML_MESSAGE,MIME_QP_LONG_LINE,RCVD_IN_DNSWL_NONE,RCVD_IN_MSPIKE_H3, RCVD_IN_MSPIKE_WL,SPF_HELO_NONE,SPF_PASS autolearn=no autolearn_force=no version=4.0.0 X-Spam-Virus: No X-Envelope-From: Received: from mail-pf1-f180.google.com (mail-pf1-f180.google.com [209.85.210.180]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by php-smtp4.php.net (Postfix) with ESMTPS for ; Thu, 18 Jul 2024 17:50:32 +0000 (UTC) Received: by mail-pf1-f180.google.com with SMTP id d2e1a72fcca58-70b0116588cso11643b3a.3 for ; Thu, 18 Jul 2024 10:49:01 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1721324940; x=1721929740; darn=lists.php.net; h=to:in-reply-to:cc:references:message-id:date:subject:mime-version :from:content-transfer-encoding:from:to:cc:subject:date:message-id :reply-to; bh=budCZ9bH0oUp5UdoVsJhCu9dB09rq0+6k3UXjdGNwlM=; b=VoxVnU354Ixi+7gktZEjz8kYufm2mOzTR5aWV/asidFyth0BUjs9GOid1MaPdtGlRy RA/bwdYIL76oRWWvoCCaPygwc8Ufqj2ekpVmYUEgGXE/tmS25Bz4p5BoQiMoZjMs1VCQ 8rxbagDsPbVx78arYMrQI2WjSiy6ssPqqZjc1OOjdWkBOiukU1ht4Yg3s9yaSvc4ctL8 2AtRbUOPtUk/8Dk33HZIqx+hrlB8KWXK3qbC7fTuwlSscJVywPQfNi/zd/ixhqA0rrqj e5rvv8AtE+Xh9X3gZV1+BT+9AeeZgGIkHyv0U3rXvtFunz1lDvkTPdGWUNfZauRzeCyZ 3Efw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1721324940; x=1721929740; h=to:in-reply-to:cc:references:message-id:date:subject:mime-version :from:content-transfer-encoding:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=budCZ9bH0oUp5UdoVsJhCu9dB09rq0+6k3UXjdGNwlM=; b=HUPwJ81FjnKgYS6W8HCk1Nq1kfLMVtDx+eHOHenPqvLHtCgZV2ApUkEDEVb11zGnpD aIOs+NwuH4B3EHN0x4ibY1a1aKCbE5HjsLQUpdwDiNSWJeJTYQPCABBvt/fJNKlNBA/6 yeE0HtpY8TRlsDDq6vOI+/8y2v3YojiOeWPQqclLMqam2x0HtdtodQ34Y0T0EgAt6c3t hoTlA5AwZoGhTA/bUfgzhSj8+BKZBT3G3eM1bqlkzAkUVoYE+MeTpr1unMi0sOnnb1P2 GQdDJZ9FLsgDXKHdqLxSwYNPozccbCx+JSb8XkxuGUUlGRh1gd/8SCzzb+hQOFoUptaR qcZQ== X-Gm-Message-State: AOJu0YwofpWcxAHj9qM5GgpwMEJbq1tmMcdLdGB71X0yXM3B6IudzkSl /7lJM/ickkT+bjgVOwHId2RK66bVc95Nc0RXZFxtwp/XUrbpmTHgMG9Mt1VK X-Google-Smtp-Source: AGHT+IGwwKQr9aQpCB/eGw06yR0rxY0lF4YCaiBbdMsX4E73VWVp2yY2GMovxh7uWAIg+ShTobN2kA== X-Received: by 2002:a05:6a20:d504:b0:1c2:9009:f6f with SMTP id adf61e73a8af0-1c418375b11mr70762637.4.1721324939814; Thu, 18 Jul 2024 10:48:59 -0700 (PDT) Received: from smtpclient.apple ([2804:c88:cafe:c692:4884:6afe:ba08:4304]) by smtp.gmail.com with ESMTPSA id d2e1a72fcca58-70cfc81d9c7sm96524b3a.148.2024.07.18.10.48.59 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Thu, 18 Jul 2024 10:48:59 -0700 (PDT) Content-Type: multipart/alternative; boundary=Apple-Mail-F006E045-4A69-4BB6-A74D-6801D61E3F1E Content-Transfer-Encoding: 7bit Precedence: bulk list-help: list-post: List-Id: internals.lists.php.net x-ms-reactions: disallow Mime-Version: 1.0 (1.0) Subject: Re: [PHP-DEV] Request for opinions: bug vs feature - change in tokenization of yield from Date: Thu, 18 Jul 2024 14:48:47 -0300 Message-ID: References: <66984FD0.5090805@adviesenzo.nl> Cc: PHP internals In-Reply-To: <66984FD0.5090805@adviesenzo.nl> To: Juliette Reinders Folmer X-Mailer: iPhone Mail (21F90) From: deleugyn@gmail.com (=?utf-8?Q?Marco_Aur=C3=A9lio_Deleu?=) --Apple-Mail-F006E045-4A69-4BB6-A74D-6801D61E3F1E Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable > On 17 Jul 2024, at 20:16, Juliette Reinders Folmer wrote: >=20 > =EF=BB=BF Hi all, >=20 > I recently discovered a change was made to the Tokenizer in PHP 8.3.0 whic= h now allows for a comment to exist between the `yield` and `from` keyword f= rom the `yield from` keyword. > Before PHP 8.3, this was a parse error. >=20 > This change was not documented (anywhere) nor publicized, which is why I o= nly recently came across it and opened a bug report on GitHub about it [1]. >=20 > This change is a breaking change for projects consuming the token stream, b= ut more than that, it introduces an inconsistency in the Tokenizer as there i= s no other single token in PHP which can contain a comment (aside from comme= nts tokens of course). > Comments were even explicitly forbidden in the PHP 8.0 RFC which changed t= he tokenization of namespaced names [2]. > Some more detailed explanation and examples can be found in the GH thread [= 3] and a detailed analysis of the tokenization change can be found in a tick= et in the PHP_CodeSniffer repo [4] (details in a fold out in the comment). >=20 > In my opinion the change is a bug and should be reverted, but opinions on t= his are divided. > After a discussion in the ticket on GitHub, it was suggested to ask for a t= hird/fourth/fifth opinion here on the list. >=20 > Pros for reverting it: > - Consistency with other tokens in the token stream, none of which allow c= omments within the token. > - Consistency with the 7 years before in which this was a parse error. > - The change was never documented, not in the changelog, news, nor in the u= pgrading guide. >=20 > Cons against reverting it: > - The change has been in PHP 8.3 releases for a little over seven months. > - The odd few people may have accidentally discovered the bug and taken ad= vantage by introducing code containing `yield /*comment*/ from` into their p= roject, which with a revert would become a parse error again. >=20 > Opinions ? >=20 > Of course, the ticket yielded discussion on whether the tokenization of `y= ield from` should be changed anyhow, but please consider that discussion out= of scope. > The current question is only about reverting the bug versus regarding it a= s a new feature and properly documenting it. >=20 > Smile, > Juliette >=20 >=20 > 1: https://github.com/php/php-src/issues/14926 > 2: https://wiki.php.net/rfc/namespaced_names_as_token#backward_incompatibl= e_changes > 3: https://github.com/PHPCSStandards/PHP_CodeSniffer/issues/529#issuecomme= nt-2209337419 > 4: https://github.com/php/php-src/issues/14926#issuecomment-2227152061 >=20 My opinion would be to revert it. Chances of someone accidentally discoverin= g and actually using it seems unlikely at best. Even then, the parser error c= ould be easily resolved by moving the comment out of the middle of the token= with insignificant readability change. Forcing all tooling that uses token_= get_all() to handle this unintentional change seems to generate more unneces= sary and real busywork for something only theoretical possible to break.= --Apple-Mail-F006E045-4A69-4BB6-A74D-6801D61E3F1E Content-Type: text/html; charset=utf-8 Content-Transfer-Encoding: quoted-printable

On 17 Jul 2024, at 20:16, Juliette Reinders Folme= r <php-internals_nospam@adviesenzo.nl> wrote:

=EF=BB=BF =20 =20 =20 Hi all,

I recently discovered a change was made to the Tokenizer in PHP 8.3.0 which now allows for a comment to exist between the `yield` and `from` keyword from the `yield from` keyword.
Before PHP 8.3, this was a parse error.

This change was not documented (anywhere) nor publicized, which is why I only recently came across it and opened a bug report on GitHub about it [1].

This change is a breaking change for projects consuming the token stream, but more than that, it introduces an inconsistency in the Tokenizer as there is no other single token in PHP which can contain a comment
(aside from comments tokens of course).
Comments were even explicitly forbidden in the PHP 8.0 RFC which changed the tokenization of namespaced names [2].
Some more detailed explanation and examples can be found in the GH thread [3]
and a detailed analysis of the tokenization change can be found in a ticket in the PHP_CodeSniffer repo [4] (details in a fold out in the comment).

In my opinion the change is a bug and should be reverted, but opinions on this are divided.
After a discussion in the ticket on GitHub, it was suggested to ask for a third/fourth/fifth opinion here on the list.

Pros for reverting it:
- Consistency with other tokens in the token stream, none of which allow comments within the token.
- Consistency with the 7 years before in which this was a parse error.
- The change was never documented, not in the changelog, news, nor in the upgrading guide.

Cons against reverting it:
- The change has been in PHP 8.3 releases for a little over seven months.
- The odd few people may have accidentally discovered the bug and taken advantage by introducing code containing `yield /*comment*/ from` into their project, which with a revert would become a parse error again.

Opinions ?

Of course, the ticket yielded discussion on whether the tokenization= of `yield from` should be changed anyhow, but please consider that discussion out of scope.
The current question is only about reverting the bug versus regarding it as a new feature and properly documenting it.

Smile,
Juliette



1: https://github.com/php/php-src/issues/14926
2:
https://wiki.= php.net/rfc/namespaced_names_as_token#backward_incompatible_changes
3
: https://github.com/PHPC= SStandards/PHP_CodeSniffer/issues/529#issuecomment-2209337419
4: https://github.com/php/php-src/i= ssues/14926#issuecomment-2227152061

=20

My opinion would be to revert it. Chances of som= eone accidentally discovering and actually using it seems unlikely at best. E= ven then, the parser error could be easily resolved by moving the comment ou= t of the middle of the token with insignificant readability change. Forcing a= ll tooling that uses token_get_all() to handle this unintentional change see= ms to generate more unnecessary and real busywork for something only theoret= ical possible to break.
= --Apple-Mail-F006E045-4A69-4BB6-A74D-6801D61E3F1E--