Newsgroups: php.internals
Subject: Re: [PHP-DEV] Request for opinions: bug vs feature - change in tokenization of yield from
To: internals@lists.php.net
From: php-internals_nospam@adviesenzo.nl (Juliette Reinders Folmer)
Date: Fri, 19 Jul 2024 07:22:31 +0200
Message-ID: <6699F817.8070806@adviesenzo.nl>

On 19-7-2024 1:09, Bob Weinand wrote:
> Hey Christoph,
>
>> On 19.07.2024 at 00:51, Christoph M. Becker <cmbecker69@gmx.de> wrote:
>>
>> Hi Bob!
>>
>> On 18.07.2024 at 15:41, Bob Weinand wrote:
>>
>>> Moreover, it can - at least - be worked around in tooling by special
>>> casing the T_YIELD_FROM token and extracting the comment from the
>>> raw parsed string:
>>>
>>> var_dump(token_get_all('<?php yield /* comment */ from $foo;'));
>>>
>>> will contain:
>>>
>>> [1]=>
>>> array(3) {
>>>   [0]=> int(270)
>>>   [1]=> string(24) "yield /* comment */ from"
>>>   [2]=> int(1)
>>> }
>>>
>>> It's not optimal, but probably the least bad solution to leave it
>>> unchanged in PHP 8.3, have tooling special case it and properly fix
>>> it in PHP 8.4.
>>
>> And what about "code" like <https://3v4l.org/4CLhM>? Is Codesniffer
>> supposed to scan the result of <https://3v4l.org/dKDcs> for possible CS
>> violations?
>>
>> Cheers,
>> Christoph
>
> I suppose you mean https://3v4l.org/IMi8Y, (you missed the <?php tag).
> If you want to scan that, it's quite easy to strip the leading yield
> and trailing from, and tokenize that again to extract all comments.
>
> Sure, it's a hack, but it'll work: https://3v4l.org/8eAiV.
>
> Bob

Hi Bob,

Of course, everything can be hacked around, but that still leaves open
the question of what the "proper tokenization" should be. Having this
change in PHP 8.3 and then - as you suggest - yet another in PHP 8.4
makes it mighty hard to have a consistent token stream in tooling,
especially as it is unclear what the "proper tokenization" should/would
be.

More than anything, I find it concerning that this change sets a
precedent for tokens to include comments.

Just as an example: what does this mean for the PHP 8.0 nullsafe object
operator? Should we now suddenly allow that to be written as
`? /*comment*/ ->`?
Or what about a cast token? Should that be allowed to be
`(string /*for reasons*/)`?

Allowing this change to stay in, without having the discussion about
what the "proper tokenization" should be, feels off and random to me and
opens the door for more random changes.

As for the impact on tooling: a change in the tokenization of any token
has an impact not only on tooling like PHPCS itself, but also on every
single external standard built on top of it, and is a breaking change.
To give you some perspective - for PHPCS we even went as far as to
"undo" the PHP 8.0 tokenization of namespaced names for the time being
(in the PHPCS 3.x releases), and we'll only change the PHPCS tokenizer to
use the PHP 8.0 tokenization in the PHPCS 4.0 release, as it would
otherwise break too many existing sniffs. [1]

Smile,
Juliette

1: https://github.com/squizlabs/PHP_CodeSniffer/issues/3041
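
For readers who want to see what the workaround quoted above amounts to (special-casing T_YIELD_FROM, stripping the leading `yield` and trailing `from`, and tokenizing the remainder again), here is a rough sketch. It is not the code behind any of the 3v4l links in this thread, and the function name `commentsInsideYieldFrom` is made up for illustration; only `token_get_all()` and the `T_*` constants are real PHP. It assumes PHP 8 and that the text of a T_YIELD_FROM token always starts with `yield` (5 characters) and ends with `from` (4 characters).

```php
<?php
// Hypothetical helper, for illustration only: collect comments that the
// tokenizer has folded into a T_YIELD_FROM token.
function commentsInsideYieldFrom(string $code): array
{
    $comments = [];

    foreach (token_get_all($code) as $token) {
        // Single-character tokens are plain strings; skip those and
        // everything that is not a T_YIELD_FROM token.
        if (!is_array($token) || $token[0] !== T_YIELD_FROM) {
            continue;
        }

        // Assumption: the token text is "yield", something in between,
        // then "from". Cut off both keywords by length.
        $middle = substr($token[1], 5, -4);

        // Re-tokenize the middle part. The "<?php " prefix only switches
        // the tokenizer into PHP mode; its own token is never a comment.
        foreach (token_get_all('<?php ' . $middle) as $inner) {
            if (is_array($inner)
                && ($inner[0] === T_COMMENT || $inner[0] === T_DOC_COMMENT)
            ) {
                $comments[] = $inner[1];
            }
        }
    }

    return $comments;
}

var_dump(commentsInsideYieldFrom('<?php yield /* comment */ from $foo;'));
```

With the tokenization shown in the quoted var_dump, this should collect the `/* comment */` text; on a PHP version that does not fold the comment into T_YIELD_FROM, it simply returns an empty array. Either way, it illustrates the point made above: tooling has to run a second tokenizer pass over a single token to get back information that every other token exposes directly.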