Newsgroups: php.internals
Xref: news.php.net php.internals:126277
Date: Mon, 03 Feb 2025 22:31:31 -0600
From: larry@garfieldtech.com ("Larry Garfield")
To: "php internals"
Subject: [PHP-DEV] Pattern matching details questions
List-Id: internals.lists.php.net

Hi folks. Ilija is still working on the implementation for the pattern matching RFC, which we want to complete before proposing it officially in case we run into implementation challenges. We have run into two such challenges, described below, and would like feedback on how to proceed.

## Object property patterns

Consider this code snippet:

    class C {
        public $prop = 42 is Foo{};
    }

The parser could interpret this in multiple ways:

* make a public property named $prop, with a default value of "the result of `42 is Foo`" (which would be false), followed by an empty (and therefore invalid) property hook block
* make a public property named $prop, with a default value of "the result of `42 is Foo {}`" (which would also be false).

Since the parser doesn't allow for ambiguity, this is not workable. Because PHP uses only an LALR(1) parser, with a single token of lookahead, there's no way to "determine from context" which is intended, eg, by saying "well the hook block is empty so it must have been part of the pattern before it."

In practice, no one should be writing code like the above, as it's needlessly nonsensical (it's statically false in all cases), but the parser doesn't know that.
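
For anyone who hasn't yet put a hook block after a default value, here is a minimal sketch of syntax that is already valid today, assuming PHP 8.4 or later. The class name `HookDemo` and the doubling logic are made up purely for illustration. The only point is that a `{` directly after a default-value expression already means "here comes a hook block", which is why `Foo{}` in that position is ambiguous.

```php
<?php
// Illustrative only: HookDemo is a hypothetical class, not taken from the RFC.
// Requires PHP 8.4+ (property hooks).
class HookDemo
{
    // A default value (42) followed immediately by a hook block.
    // Inside the hook, $this->prop refers to the backing value.
    public int $prop = 42 {
        get => $this->prop * 2;
    }
}

var_dump((new HookDemo)->prop); // int(84)
```

With an object pattern allowed in the default-value expression, the parser would have to decide at that `{` whether it is still reading the pattern or has moved on to the hooks, with only one token to go on.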

The only solution we've come up with is to not have object patterns, but instead have a property list pattern that can be compounded with a type pattern. Eg:

    $p is Point & {x: 5}

Instead of what we have now:

    $p is Point{x: 5}

Ilija says this will resolve the parsing issue. It would also make it possible to match `$p is {x: 5}`, which would not check the type of $p at all, just that it's an object with an $x property with value 5. That is arguably a useful feature in some cases, but does make the common case (matching type and properties) considerably more clunky.

So, questions:

1. Would splitting the object pattern like that be acceptable?
2. Does someone have a really good alternate suggestion that wouldn't confuse the parser?

## Variable binding and pinning

Previously there was much discussion about the syntax we wanted for these features. In particular, variable binding means "pull a sub-value out of the matched value to its own variable, if the pattern matches." Variable pinning means "use some already-existing variable here to dynamically form the pattern."

Naturally, these cannot both just be a variable name on their own, as that would be confusing (both for users and the engine). For example:

    $b = '12';

    if ($arr is ['a' => assign to $a, 'b' => assert is equal to $b]) {
        print $a;
    }

Based on my research[1], the overwhelming majority of languages use a bare variable name to indicate variable binding. Of the languages I surveyed, only Ruby has variable pinning, which it indicates with a ^ prefix.

Following Ruby's lead, as the RFC text does right now, would yield:

    $b = '12';

    if ($arr is ['a' => $a, 'b' => ^$b]) {
        print $a;
    }

That approach would be most like other languages with pattern matching. However, there is a concern that it wouldn't be self-evident to PHP devs, and that the variable binding side should have the extra marker instead.
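
To make the two roles concrete, here is a rough sketch of what the example above is meant to do, written in PHP that runs today (8.0+). The helper `matchAB()` is hypothetical and exists only for this illustration. It also makes two assumptions that are not settled by anything above: that pinning compares strictly (`===`), and that extra array keys are simply ignored.

```php
<?php
// Hypothetical helper, not part of the RFC or of PHP itself.
// Emulates: $arr is ['a' => <bind to $a>, 'b' => <pin against $pinned>]
function matchAB(array $arr, mixed $pinned, mixed &$a): bool
{
    // Both keys must be present for the pattern to match at all.
    if (!array_key_exists('a', $arr) || !array_key_exists('b', $arr)) {
        return false;
    }

    // Pinning: compare against an already-existing value.
    // Strict comparison is an assumption made for this sketch.
    if ($arr['b'] !== $pinned) {
        return false;
    }

    // Binding: only write to $a once the whole pattern has matched.
    $a = $arr['a'];
    return true;
}

$b = '12';
$arr = ['a' => 'hello', 'b' => '12'];

if (matchAB($arr, $b, $a)) {
    print $a;
}
```

Whichever marker is chosen, the engine has to tell "write to this variable" apart from "read this variable", which is the distinction the by-reference parameter and the by-value parameter carry in the sketch.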

Ilija has suggested &, as that's what's used for references, which would result in:

    $b = '12';

    if ($arr is ['a' => &$a, 'b' => $b]) {
        print $a;
    }

There are two concerns with this approach.

1. The & could get confusing with an AND conjunction, eg, `$value is int & &$x` (which is how you would bind $value to $x iff it is an integer).
2. In practice, binding is almost certainly going to be vastly more common than pinning, so it should likely have the shorter syntax.

There are of course other prefixes that could be used, such as `let` (introduces a new keyword, possibly confusing as it wouldn't imply scope restrictions like in other languages) or `var` (no new keyword, but could still be confusing and it's not obvious which side should get it), but ^ is probably the only single-character option.

So, questions:

1. Are you OK with the current Ruby-inspired syntax? ($a means bind, ^$b means pin.)
2. If not, have you a counter-proposal that would garner consensus?

Thanks all.

[1] https://github.com/Crell/php-rfcs/blob/master/pattern-matching/research.md

-- 
Larry Garfield
larry@garfieldtech.com