From: tovilo.ilija@gmail.com (Ilija Tovilo)
Date: Fri, 7 Feb 2025 01:35:25 +0100
Subject: Re: [PHP-DEV] Pattern matching details questions
To: php internals

On Thu, Feb 6, 2025 at 10:43 PM Christoph M. Becker wrote:
>
> On 06.02.2025 at 20:24, Larry Garfield wrote:
>
> > On Thu, Feb 6, 2025, at 3:05 AM, Valentin Udaltsov wrote:
> >
> >> Are there any plans to upgrade the parser to bypass these limitations?
> >> I remember Nikita shared some thoughts on why this is not trivial in
> >> https://wiki.php.net/rfc/arrow_functions_v2. Maybe something has
> >> changed since then?
> >
> > I'm not aware of any plans to change the parser. That would be a rather dramatic and invasive change.
>
> There have been ideas to use some more powerful features of bison[1],
> like GLR, so that would not necessarily be a drastic and invasive
> change. I'm not aware of any concrete plans, and these more powerful
> features are not without downsides.

I don't think there's a big incentive to switch to a GLR parser right now. First off, I don't believe it actually solves the ambiguity problem we've described in this thread (`class C { public $prop = 42 is Foo{}; }`), which is not limited by lookahead, but is a full-blown syntax ambiguity. *Technically* it could be solved in our current LALR(1) parser by duplicating the expr production, removing pattern matching in this production and using it solely for property initializers, but this is a bad long-term solution.

Secondly, single lookahead grammars are easier for machines and humans to understand.
Unfortunately, it's hard to predict future syntax changes, but I believe we have managed to find acceptable compromises so far. It's worth noting that some newer languages also strive to avoid grammars that need more than one token of lookahead. As an example, see Rust's turbofish syntax (e.g. `Vec::<T>`), used for generics in the general expression context to avoid confusion with the `<` less-than comparison.

Also worth noting: Switching to a GLR parser might cause a significant amount of work for nikic/PHP-Parser, which is based on ircmaxell/php-yacc, which can only generate LALR(1) parsers. It might cause even more problems for token-based tools. Sticking with the generics example, `[bar < Bar, Baz > ()]` will require a lot of scanning to understand whether to remove the spaces between bar and `<`. The `::<` turbofish syntax on the other hand immediately indicates generics.

Anyway, it seems we have slightly gone off-topic. :)

Ilija
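
P.S. For anyone who hasn't run into the turbofish, here is a minimal Rust sketch of the point above. The function names `parse_number` and `compare` are made up for illustration and are not taken from any real code base. It only shows that `::<` introduces generic arguments in expression position, while a bare `<` or `>` is read as a comparison.

```rust
/// Parses a decimal string. The `::<i32>` turbofish names the target type
/// explicitly, so the `<` here cannot be read as a comparison.
/// (Hypothetical helper, for illustration only.)
fn parse_number(s: &str) -> i32 {
    s.parse::<i32>().expect("not a valid i32")
}

/// In expression position a bare `<` or `>` is a comparison operator, so this
/// returns a tuple of two booleans rather than a generic call.
/// (Hypothetical helper, for illustration only.)
fn compare(a: i32, b: i32, c: i32) -> (bool, bool) {
    (a < b, b > c)
}

fn main() {
    // `Vec::<u8>` spells out the element type with the turbofish.
    let empty = Vec::<u8>::new();
    println!("{} {} {:?}", empty.len(), parse_number("42"), compare(1, 2, 3));
}
```

In both cases the parser can decide what `<` means by looking at the tokens it has already seen, with no need to scan ahead for a matching `>`.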