Newsgroups: php.internals
From: levi.morrison@datadoghq.com (Levi Morrison)
Date: Tue, 11 Jun 2024 08:13:01 -0600
Subject: Re: [PHP-DEV] Revisiting case-sensitivity in PHP
To: Ben Ramsey
Cc: Valentin Udaltsov, php internals

On Mon, Jun 10, 2024 at 9:40 PM Ben Ramsey wrote:
>
> > On Jun 10, 2024, at 20:35, Valentin Udaltsov wrote:
> >
> > Hi, internals!
> >
> > 9 years have passed since the last discussions of case sensitive PHP: https://externals.io/message/79824 and https://externals.io/message/83640.
> > Here I would like to revisit this topic.
> >
> > What is case-sensitive in PHP 8.3:
> > - variables
> > - constants (all since https://wiki.php.net/rfc/case_insensitive_constant_deprecation)
> > - class constants
> > - properties
> >
> > What is case-insensitive in PHP 8.3:
> > - namespaces
> > - functions
> > - classes (including self, parent and static relative class types)
> > - methods (including the magic ones)
> >
> > Pros:
> > 1. no need to convert strings to lowercase inside the engine for name lookups (a small performance and memory gain)
> > 2. better fit for case sensitive platforms that PHP code is mostly run on (Linux)
> > 3. uniform handling of ASCII and non-ASCII symbols (currently non-ASCII symbols in names are case sensitive: https://3v4l.org/PWkvG)
> > 4. PSR-4 compatibility (https://www.php-fig.org/psr/psr-4/#:~:text=All%20class%20names%20MUST%20be%20referenced%20in%20a%20case%2Dsensitive%20fashion)
> >
> > Cons:
> > 1. pain for users, obviously
> > 2. a backward compatibility layer might be difficult to implement and/or have a performance penalty
> >
> > On con 1. I think today PHP users are much more prepared for the change:
> > - more and more projects adopted namespaces and PSR-4 autoloading via Composer that never supported case-insensitivity (https://github.com/composer/composer/issues/1803, https://github.com/composer/composer/issues/8906) which forced to mind casing
> > - static analyzers became more popular and they do complain about the wrong casing (see https://psalm.dev/r/fbdeee2f38 and https://phpstan.org/r/1789a32d-d928-4311-b02e-155dd98afbd4)
> > - Rector appeared (it can be used to automatically prepare the codebase for the next PHP version)
> >
> > On con 2. While considering different transition options proposed in prior discussions (compilation flag, ini option, deprecation notice) I stumbled upon Nikita's comment (https://externals.io/message/79824#79939):
> > May I recommend to only target class and class-like names for an initial RFC? Those have the strongest argument in favor of case-sensitivity given how current autoloader implementations work - essentially the case-insensitivity doesn't properly work anyway in modern code....I'd also appreciate having a voting option for removing case-insensitivity right away, as opposed to throwing E_STRICT/E_DEPRECATED. If we want to change this, I personally would rather drop it right away than start throwing E_STRICT warnings that would make the case-insensitive usage impossible anyway.
> > It makes a lot of sense to me: a fairly simple change in the core and no performance penalty. At the same time, a gradual approach will reduce the stress.
> >
> > So the plan for 8.4 might be to just drop case insensitivity for class names and that's it... Let's discuss that!
> >
> I’m not saying I agree with or support this, but I think your proposal has a better chance of being accepted if you target PHP 9.0 instead of 8.4.
>
> Cheers,
> Ben
>

In fact, it's definitely a BC break I would not personally vote for in 8.4.
This isn't some minor thing squirreled away in a library--this is the core language, with wide impact. For this reason, I believe it should target 9.0.

I will happily vote for this feature, as long as the patch is reasonable. The most obvious implementation is not very good, though. The engine uses lowercase names for case insensitivity. Namespaces are embedded into the type names. To lowercase the namespace but not the type name, one could do a reverse scan for a namespace separator on the type name, and then lowercase from the start to the index of the namespace separator. For example, "Psr\Log\LoggerInterface" needs to become "psr\log\LoggerInterface". The problem with this is that it's not really going to save CPU or memory, because it still has to lowercase the namespace.

We could refactor the engine to store the namespace separately from the type name. This is a lot more work and will increase the size of some types, which might be difficult at a technical level.

I can't think of other implementations right now. If nobody can come up with a better implementation, I think we should consider going with split-sensitivity on namespaces, where the namespace matches the sensitivity of the thing it is attached to. A namespaced class would have a case-sensitive namespace, but a namespaced function would still have a case-insensitive one.
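
To make the reverse-scan idea concrete, here is a rough standalone sketch in plain C. It is not engine code: make_class_lookup_key is a hypothetical name, it works on ordinary C strings rather than the engine's own string type, and it assumes ASCII-only lowercasing. It only shows where the cost lands.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not an engine API): returns a newly allocated lookup
 * key in which the namespace part of a qualified class name is lowercased
 * and the class name itself keeps its original casing.
 * The caller owns the result and must free() it. Returns NULL on OOM. */
char *make_class_lookup_key(const char *name)
{
    assert(name != NULL);

    size_t len = strlen(name);
    char *key = malloc(len + 1);
    if (key == NULL) {
        return NULL;
    }
    memcpy(key, name, len + 1); /* copy including the terminating NUL */

    /* Reverse scan: find the last namespace separator, if any. */
    char *sep = strrchr(key, '\\');
    if (sep != NULL) {
        /* Lowercase everything before the separator. ASCII only (an
         * assumption of this sketch); other bytes are left untouched. */
        for (char *p = key; p < sep; p++) {
            if (*p >= 'A' && *p <= 'Z') {
                *p = (char)(*p - 'A' + 'a');
            }
        }
    }
    return key;
}
```

Calling it with "Psr\Log\LoggerInterface" gives "psr\log\LoggerInterface", and a name without a namespace comes back unchanged. The scan, the copy and the lowercasing pass over the namespace are all still there, which is why I don't expect this variant to save much.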