Newsgroups: php.internals
From: contact@developer-rob.co.uk (Robert Humphries)
Date: Wed, 25 Mar 2026 08:23:38 +0000
Subject: Re: Re: [PHP-DEV] [RFC] Remove \0 from default trim() character mask
To: "internals@lists.php.net"

> Reasons for supporting
> - semantically, NUL is not whitespace
> - the majority of other popular languages don't trim NUL
> Reasons for not supporting
> - Java does trim NUL
> - Security issues in existing code bases
> - mb_trim() and the second parameter already exist to prevent trimming NUL if people want that
> - Unnecessary changes in the life-cycle

I think it would be useful if there were some examples of when you would want to use `trim` but _not_ trim NUL bytes.
The examples in the RFC currently show the expected change in behaviour, which is good, but you could achieve the same effect by not running `trim` in the first place: the only character in the examples that is removed, before or after the change, is the NUL byte (even in the example with a new line followed by NUL bytes, the string after the change would be identical to the string before the `trim`).

Given that most voters seem to be not strongly against, but also see no benefit in changing the status quo, some examples of where the change would be useful might help.

~ Robert

On Wed, Mar 25, 2026 at 6:15 AM LamentXU wrote:
>
> I think there are sound opinions on both sides, so I will still let the vote begin and see what the majority thinks. In short:
>
> Reasons for supporting
> - semantically, NUL is not whitespace
> - the majority of other popular languages don't trim NUL
> Reasons for not supporting
> - Java does trim NUL
> - Security issues in existing code bases
> - mb_trim() and the second parameter already exist to prevent trimming NUL if people want that
> - Unnecessary changes in the life-cycle
>
> This is quite a minor change (and that's why people haven't talked about this before, since few people run into the case of trimming NUL).
>
> Well, my opinion is, first, that trimming is indeed for whitespace. I know Java does trim NULs, but it doesn't do so explicitly: it removes every char with an ASCII code <= 0x20 (and I think most people use strip() instead, which doesn't remove NUL). Besides, almost every other standard or language doesn't trim NUL. So for the sake of aligning with popular standards and languages, it makes sense to avoid trimming NUL.
>
> Security and life-cycle concerns are good points. No longer trimming NUL may enable a sort of path hacking, as Ilia mentioned, and PHP trimming \0 is already well known among PHP devs who have run into this case before.
>
> We have a second parameter to alter the trimmed character set, but I do think most people would expect NUL not to be trimmed by default, in line with other languages.
>
> At 2026-03-25 03:15:25, "Ilia" wrote:
>
> That seems a bit dangerous, since a non-stripped \0 can lead to issues: when concatenated with other strings, which is quite common in string operations, it can result in unpredictability and possibly even security issues.
>
> You make a good point about other languages. The concern is that while that is the expectation there, and the different solutions that exist for sanitizing/handling \0 are well known and understood, in PHP the assumption is that \0 is removed, and changing this assumption breaks a lot of things.
>
> Just my 2c.
>
> On Sun, Mar 15, 2026 at 2:23 AM LamentXU wrote:
>>
>> Dear all,
>>
>> I am sending this to introduce my new RFC: https://wiki.php.net/RFC/dont_trim_NUL
>>
>> Quick summary:
>>
>> Currently, PHP's trim functions strip the NUL byte (\0) by default, treating it alongside spaces, tabs, and newlines. This creates a highly surprising edge case.
>>
>> Because \0 is semantically a control character or a vital part of a binary payload rather than a typographical whitespace character, casually using trim() to clean up trailing newlines can silently corrupt binary streams or cryptographic hashes by stripping legitimate NUL bytes. Whitespace characters are intended for typographical spacing and formatting (e.g., spaces, newlines, tabs).
>>
>> Also, almost every mainstream programming language except PHP doesn't trim NUL characters (Python, Go, Rust, JS, even the 'isspace' function in glibc...). It sounds reasonable to expect the same here.
>>
>> This RFC proposes removing \0 (ASCII 0) from the default character mask. I recognize this introduces a backward compatibility break, and therefore I would love to hear your thoughts, feedback, and any concerns regarding the BC impact before moving forward.
>>
>> Cheers,
>> Weilin Du
>
>
>
> --
> Ilia Alshanetsky
> Technologist, CTO, Entrepreneur
> E: ilia@ilia.ws
> T: @iliaa
> B: http://ilia.ws
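
For reference, a minimal PHP sketch of the behaviour discussed in this thread. The input string is a hypothetical example (it is not taken from the RFC), chosen so that a NUL byte sits before a trailing newline. The explicit second argument emulates the proposed default, under the assumption that the RFC only drops \0 from the mask and leaves the other characters alone.

```php
<?php
// Hypothetical input (not from the RFC): a payload ending in NUL, then a newline.
$raw = "payload\0\n";

// Current default mask: " \n\r\t\v\0". Both the newline and the NUL are stripped.
$default = rtrim($raw);

// Explicit mask without \0: only the trailing newline is stripped.
// Assumption: this matches what the RFC proposes as the new default.
$explicit = rtrim($raw, " \n\r\t\v");

echo bin2hex($default), "\n";   // 7061796c6f6164
echo bin2hex($explicit), "\n";  // 7061796c6f616400
```

With today's default the trailing NUL is lost along with the newline; passing the mask explicitly keeps it, which is the workaround already mentioned above via the second parameter.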