From: rob@bottled.codes ("Rob Landers")
To: internals@lists.php.net
Date: Tue, 04 Feb 2025 08:51:20 +0100
Subject: Re: [PHP-DEV] Pattern matching details questions
Newsgroups: php.internals

On Tue, Feb 4, 2025, at 05:31, Larry Garfield wrote:
Hi folks. Ilija is still working on the implementation for the pattern matching RFC, which we want to complete before proposing it officially in case we run into implementation challenges.

Such as these, on which we'd like feedback on how to proceed.

## Object property patterns

Consider this code snippet:

class C {
    public $prop = 42 is Foo{};
}

The parser could interpret this in multiple ways:

* make a public property named $prop, with a default value of "the result of `42 is Foo`" (which would be false), followed by an empty (and therefore invalid) property hook block
* make a public property named $prop, with a default value of "the result of `42 is Foo {}`" (which would also be false).

Since the parser doesn't allow for ambiguity, this is not workable. Because PHP's parser is LALR(1), with only a single token of lookahead, there's no way to "determine from context" which is intended, e.g., by saying "well the hook block is empty so it must have been part of the pattern before it."
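
For context, here is a minimal sketch of why a `{` after a default value already has a meaning: since PHP 8.4 a property declaration may carry a default value followed by a hook block. The class and property names are made up for illustration, and the snippet assumes PHP 8.4 or later.

```php
<?php

// Hypothetical class, only to show a default value followed by a hook block.
final class Account
{
    // After "= 'anon'" the parser meets "{" and has to treat it as the start
    // of a hook block. In `42 is Foo{}` the pattern's "{" would appear in the
    // same position, which is where the conflict comes from.
    public string $name = 'anon' {
        set => strtolower($value);
    }
}

$account = new Account();
$account->name = 'ROB';      // goes through the set hook
echo $account->name, "\n";
```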

In practice, no one should be writing code like the above, as it's needlessly nonsensical (it's statically false in all cases), but the parser doesn't know that.

The only solution we've come up with is to not have object patterns, but have a property list pattern that can be compounded with a type pattern. E.g.:

$p is Point & {x: 5}

Instead of what we have now:

$p is Point{x: 5}

Ilija says this will resolve the parsing issue. It would also make it possible to match `$p is {x: 5}`, which would not check the type of $p at all, just that it's an object with an $x property with value 5. That is arguably a useful feature in some cases, but does make the common case (matching type and properties) considerably more clunky.
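
To make the difference concrete, the two forms can be approximated in today's PHP. This is only a sketch: the `Point` class and both function names are hypothetical, and it assumes the literal `5` would be compared strictly.

```php
<?php

// Hypothetical value object used only for this illustration.
final class Point
{
    public function __construct(public int $x = 0, public int $y = 0) {}
}

// Roughly what `$p is Point & {x: 5}` would check: type AND property list.
function matchesPointWithX5(mixed $p): bool
{
    return $p instanceof Point && $p->x === 5;
}

// Roughly what a bare `$p is {x: 5}` would check: any object with an
// accessible $x equal to 5, regardless of its class.
function matchesAnyObjectWithX5(mixed $p): bool
{
    return is_object($p) && isset($p->x) && $p->x === 5;
}

$loose = new stdClass();
$loose->x = 5;

var_dump(matchesPointWithX5($loose));      // not a Point
var_dump(matchesAnyObjectWithX5($loose));  // type is not checked
```

The second function accepts the `stdClass` instance while the first rejects it, which is the extra flexibility (and the extra typing for the common case) described above.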

So, questions:

1. Would splitting the object pattern like that be acceptable?
2. Does someone have a really good alternate suggestion that wouldn't confuse the parser?

## Variable binding and pinning

Previously there was much discussion about the syntax we wanted for these features. In particular, variable binding means "pull a sub-value out of the matched value to its own variable, if the pattern matches." Variable pinning means "use some already-existing variable here to dynamically form the pattern." Naturally, these cannot both just be a variable name on their own, as that would be confusing (both for users and the engine).

For example:

$b = '12';

if ($arr is ['a' => assign to $a, 'b' => assert is equal to $b]) {
    print $a;
}

Based on my research[1], the overwhelming majority of languages use a bare variable name to indicate variable binding. Only one language, Ruby, has variable pinning, which it indicates with a ^ prefix. Following Ruby's lead, as the RFC text does right now, would yield:

$b = '12';

if ($arr is ['a' => $a, 'b' => ^$b]) {
    print $a;
}
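
As a rough illustration of the intended semantics, the same check can be written by hand in current PHP. The function name is hypothetical, and the sketch assumes strict comparison for the pinned value and ignores any extra array keys; the RFC may decide both points differently.

```php
<?php

/**
 * Hand-written stand-in for `$arr is ['a' => $a, 'b' => ^$b]`.
 * $a is bound (written) only on success; $b is pinned (only read).
 */
function matchBindAPinB(mixed $arr, mixed &$a, mixed $b): bool
{
    if (!is_array($arr)
        || !array_key_exists('a', $arr)
        || !array_key_exists('b', $arr)
        || $arr['b'] !== $b          // pinned: compare against the existing value
    ) {
        return false;
    }

    $a = $arr['a'];                  // bound: copy the sub-value out
    return true;
}

$b = '12';
$a = null;

if (matchBindAPinB(['a' => 'hello', 'b' => '12'], $a, $b)) {
    print $a;
}
```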

That approach would be most like other languages with pattern matching.

However, there is a concern that it wouldn't be self-evident to PHP devs, and that the variable binding side should have the extra marker instead. Ilija has suggested &, as that's what's used for references, which would result in:

$b = '12';

if ($arr is ['a' => &$a, 'b' => $b]) {
    print $a;
}

There are two concerns with this approach.

1. The & could get confusing with an AND conjunction, e.g., `$value is int & &$x` (which is how you would bind $value to $x iff it is an integer).
2. In practice, binding is almost certainly going to be vastly more common than pinning, so it should likely have the shorter syntax.
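
For reference, the `$value is int & &$x` case from the first point does two separate jobs, which is easier to see when spelled out in current PHP. The function name is hypothetical, and leaving $x untouched on a failed match is an assumption of this sketch.

```php
<?php

// Stand-in for `$value is int & &$x`: the first `&` is the AND between two
// patterns, the second `&` marks $x as the variable to bind.
function bindIfInt(mixed $value, mixed &$x): bool
{
    if (!is_int($value)) {
        return false;   // no match: $x is left untouched
    }

    $x = $value;        // match: bind the whole value to $x
    return true;
}

$x = null;
var_dump(bindIfInt(42, $x));    // matches, $x now holds 42
var_dump(bindIfInt('42', $x));  // a numeric string is not an int
```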

There are of course other prefixes that could be used, such as `let` (introduces a new keyword, possibly confusing as it wouldn't imply scope restrictions like in other languages) or `var` (no new keyword, but could still be confusing and it's not obvious which side should get it), but ^ is probably the only single-character option.

So, questions:

1. Are you OK with the current Ruby-inspired syntax? ($a means bind, ^$b means pin.)
2. If not, have you a counter-proposal that would garner consensus?


Thanks all.

[1] https://github.com/Crell/php-rfcs/blob/master/pattern-matching/research.md

-- 
  Larry Garfield
  larry@garfieldtech.com


Hey Larry,

Instead of symbols, why not use words?

We already have &&, but it looks like this uses & instead, which is a bitwise-and. But the language does have “and” as a keyword. So instead of:

$value is int & &$x

It would be:

$value is int and &$x

Which removes the confusion you mentioned before (also for someone like me who uses bitwise-and quite a bit).
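
One thing worth keeping in mind with the word form: in today's PHP expressions, `and` has lower precedence than `=`, unlike `&&`. Whether `and` inside a pattern would follow the same rules is not settled by anything in this thread, so the sketch below only shows current behaviour, with made-up variable names.

```php
<?php

// `&&` binds tighter than `=`, so the whole condition is assigned.
$viaSymbol = true && false;   // parsed as $viaSymbol = (true && false)

// `and` binds looser than `=`, so the assignment happens first.
$viaWord = true and false;    // parsed as ($viaWord = true) and false

var_dump($viaSymbol); // bool(false)
var_dump($viaWord);   // bool(true)
```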

— Rob