Newsgroups: php.internals
From: udaltsov.valentin@gmail.com (Valentin Udaltsov)
Date: Thu, 6 Feb 2025 12:05:43 +0300
Subject: Re: [PHP-DEV] Pattern matching details questions
To: Larry Garfield
Cc: php internals

On Tue, 4 Feb 2025 at 07:35, Larry Garfield <larry@garfieldtech.com> wrote:

Hi folks. Ilija is still working on the implementation for the pattern matching RFC, which we want to complete before proposing it officially in case we run into implementation challenges.

Such as these, on which we'd like feedback on how to proceed.

## Object property patterns

Consider this code snippet:

class C {
    public $prop = 42 is Foo{};
}

The parser could interpret this in multiple ways:

* make a public property named $prop, with default value of "the result of `42 is Foo`" (which would be false), which then has an empty (and therefore invalid) property hook block
* make a public property named $prop, with default value of "whatever the result of `42 is Foo {}` is" (which would be false).

Since the parser doesn't allow for ambiguity, this is not workable. Because PHP uses only an LL(1) parser, there's no way to "determine from context" which is intended, eg, by saying "well the hook block is empty so it must have been part of the pattern before it."

In practice, no one should be writing code like the above, as it's needlessly nonsensical (it's statically false in all cases), but the parser doesn't know that.

The only solution we've come up with is to not have object patterns, but have a property list pattern that can be compounded with a type pattern. Eg:

$p is Point & {x: 5}

Instead of what we have now:

$p is Point{x: 5}

Ilija says this will resolve the parsing issue. It would also make it possible to match `$p is {x: 5}`, which would not check the type of $p at all, just that it's an object with an $x property with value 5. That is arguably a useful feature in some cases, but does make the common case (matching type and properties) considerably more clunky.
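
For readers who want to see the two forms side by side, here is a rough sketch of what they would presumably check, written in today's PHP. The `Point` class, the helper names `matchesPointWithX` and `matchesAnyObjectWithX`, and the use of strict comparison (`===`) are assumptions made for illustration; none of them is specified in the text above.

```php
<?php

// Hypothetical class used only for this illustration.
final class Point
{
    public function __construct(public int $x, public int $y) {}
}

// Roughly what `$p is Point & {x: 5}` (or `$p is Point{x: 5}`) would check:
// the type first, then the property value. Strict comparison is an assumption.
function matchesPointWithX(mixed $p, int $expected): bool
{
    return $p instanceof Point && $p->x === $expected;
}

// Roughly what the type-less `$p is {x: 5}` would check:
// any object whose $x property is readable from here and holds the value.
function matchesAnyObjectWithX(mixed $p, int $expected): bool
{
    return is_object($p) && isset($p->x) && $p->x === $expected;
}

$point = new Point(5, 0);

$other = new stdClass();
$other->x = 5;

var_dump(matchesPointWithX($point, 5));     // bool(true)
var_dump(matchesPointWithX($other, 5));     // bool(false)
var_dump(matchesAnyObjectWithX($other, 5)); // bool(true)
```

The second helper accepts the `stdClass` instance that the first one rejects, which is the extra capability (and the extra looseness) that the split syntax would bring.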

So, questions:

1. Would splitting the object pattern like that be acceptable?
2. Does someone have a really good alternate suggestion that wouldn't confuse the parser?

## Variable binding and pinning

Previously there was much discussion about the syntax we wanted for these features. In particular, variable binding means "pull a sub-value out of the matched value to its own variable, if the pattern matches." Variable pinning means "use some already-existing variable here to dynamically form the pattern." Naturally, these cannot both just be a variable name on their own, as that would be confusing (both for users and the engine).

For example:

$b = '12';

if ($arr is ['a' => assign to $a, 'b' => assert is equal to $b]) {
    print $a;
}

Based on my research[1], the overwhelming majority of languages use a bare variable name to indicate variable binding. Only one language, Ruby, has variable pinning, which it indicates with a ^ prefix. Following Ruby's lead, as the RFC text does right now, would yield:

$b = '12';

if ($arr is ['a' => $a, 'b' => ^$b]) {
    print $a;
}

That approach would be most like other languages with pattern matching.
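
To make the bind/pin distinction concrete, the Ruby-style example above could be approximated in current PHP roughly as follows. This is only a sketch: the sample array contents and the helper `matchAB` are made up, and it assumes that pinning compares with `===` and that the array pattern only requires the listed keys to be present. Neither assumption is stated above.

```php
<?php

// Hypothetical helper emulating `$arr is ['a' => $a, 'b' => ^$b]`.
// On success it returns the value that would be bound to $a (wrapped in an
// array so that a bound null is distinguishable from "no match"); on failure
// it returns null.
function matchAB(array $arr, mixed $pinned): ?array
{
    // Both keys named in the pattern must be present.
    if (!array_key_exists('a', $arr) || !array_key_exists('b', $arr)) {
        return null;
    }

    // Pin: compare against the value of an already-existing variable.
    if ($arr['b'] !== $pinned) {
        return null;
    }

    // Bind: pull the sub-value out so the caller can assign it.
    return ['a' => $arr['a']];
}

$b = '12';

$match = matchAB(['a' => 'hello', 'b' => '12'], $b);
if ($match !== null) {
    $a = $match['a'];
    print $a . "\n"; // prints "hello"
}

// NULL: the integer 12 is not identical to the pinned string '12'.
var_dump(matchAB(['a' => 'hello', 'b' => 12], $b));
```

Binding writes a new variable and pinning only reads an existing one, which is why the two cannot share the same bare `$name` spelling.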

However, there is a concern that it wouldn't be self-evident to PHP devs, and the variable binding side should have the extra marker. Ilija has suggested &, as that's what's used for references, which would result in:

$b = '12';

if ($arr is ['a' => &$a, 'b' => $b]) {
    print $a;
}

There are two concerns with this approach.

1. The & could get confusing with an AND conjunction, eg, `$value is int & &$x` (which is how you would bind $value to $x iff it is an integer).
2. In practice, binding is almost certainly going to be vastly more common than pinning, so it should likely have the shorter syntax.
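
For reference, the intended meaning of `$value is int & &$x` in point 1 can be approximated in current PHP along these lines. The helper `bindIfInt` is hypothetical, and leaving `$x` untouched on a failed match is an assumption rather than something stated above.

```php
<?php

// Hypothetical helper emulating `$value is int & &$x`:
// $x is written (by reference) only when the type check succeeds.
function bindIfInt(mixed $value, mixed &$x): bool
{
    if (!is_int($value)) {
        return false; // no match, $x is left untouched
    }

    $x = $value; // bind
    return true;
}

$x = null;
$matched = bindIfInt(42, $x);
var_dump($matched, $x); // bool(true), then int(42)

$y = null;
$matched = bindIfInt('42', $y);
var_dump($matched, $y); // bool(false), then NULL
```

In the proposed spelling the first & is the conjunction and the second is the binding marker, which is exactly the visual clash that point 1 describes.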

There are of course other prefixes that could be used, such as `let` (introduces a new keyword, possibly confusing as it wouldn't imply scope restrictions like in other languages) or `var` (no new keyword, but could still be confusing and it's not obvious which side should get it), but ^ is probably the only single-character option.

So, questions:

1. Are you OK with the current Ruby-inspired syntax? ($a means bind, ^$b means pin.)
2. If not, have you a counter-proposal that would garner consensus?


Thanks all.

[1] https://github.com/Crell/php-rfcs/blob/master/pattern-matching/research.md

--
  Larry Garfield
  larry@garfieldtech.com

Hi, Larry!

First of all, I'm very excited about your Pattern Matching RFC and looking forward to it.

> Because PHP uses only an LL(1) parser

Are there any plans to upgrade the parser to bypass these limitations? I remember Nikita shared some thoughts on why this is not trivial in https://wiki.php.net/rfc/arrow_functions_v2. Maybe something has changed since then?

--
Valentin