From: nyamsprod@gmail.com (ignace nyamagana butera)
To: "Rowan Tommins [IMSoP]", PHP Internals List
Date: Tue, 1 Jul 2025 23:27:14 +0200
Subject: Re: [PHP-DEV] [RFC][DISCUSSION] Add RFC 4648 compliant data encoding API

On Tue, Jul 1, 2025 at 1:09 PM Rowan Tommins [IMSoP] wrote:

> On 19 June 2025 12:01:04 BST, ignace nyamagana butera wrote:
> >RFC proposal link: https://wiki.php.net/rfc/data_encoding_api
>
> Thanks for working on this, I have often had to implement base64url and
> been frustrated it's not just a built-in option.
>
> I like the look of the new API. Using namespaced enums is currently quite
> verbose, but that's something we could try to fix at the language level
> - e.g. Swift has some nice inference rules, so you can write the equivalent
> of base64_encode($string, ::UrlSafe).
>
> One thing I think the RFC should mention is the future of the existing
> base64_encode/decode functions. Am I right in thinking that with one
> parameter, the new namespaced versions will be identical to the old? If so,
> we have the option to make the existing functions aliases for the new. Or,
> we can leave them as-is, but plan to deprecate them. What we probably don't
> want is to indefinitely have two versions with such similar names but
> different signatures.
>
> Rowan Tommins
> [IMSoP]

Hi Rowan,

Currently the RFC does not address deprecating the existing functions, for
the following reasons:

- The current base64_decode function operates in a lenient mode by default:
  it accepts characters outside the valid Base64 alphabet and ignores the
  padding character wherever it appears in the string.

      base64_decode('dG9===0bw??', false); // returns 'toto'

  The newly proposed lenient mode, however, follows the stricter
  recommendations of RFC 4648, Section 12, which advise rejecting input
  that contains invalid characters because of potential security concerns.
  The behavior therefore differs significantly: the current implementation
  tolerates non-alphabet characters and accepts padding characters anywhere
  in the encoded string, whereas the proposed version rejects both, to
  improve security and compliance with the standard.

      // will throw: character outside the Base64 alphabet
      // (RFC 4648 security recommendation)
      Encoding\base64_decode('dG90bw??', DecodingMode::Lenient);

      // will throw: padding character not located at the end of the string
      // (RFC 4648 security recommendation)
      Encoding\base64_decode('dG9===0bw', DecodingMode::Lenient);

      // returns 'toto'
      Encoding\base64_decode('dG90bw', DecodingMode::Lenient);

- hex2bin always operates in a lenient mode; it does not support strict
  validation. It could be replaced by the new base16_decode function when
  configured with the appropriate options. Note, however, that the default
  behavior differs: unlike hex2bin, base16_decode defaults to strict mode
  and rejects invalid input by design, consistent with all of the newly
  proposed decoding functions.

For those reasons, I believe a clear deprecation and removal strategy for
the current functions warrants its own dedicated RFC, as certain features
cannot be easily migrated to the new API.
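
To make the base64 difference described above concrete, here is a rough
userland sketch that approximates the rules of the proposed lenient mode
using only functions that exist today. The helper name
lenient_base64_decode_sketch and its exception messages are made up for
illustration; this is not the RFC's implementation, and the real API may
differ in details such as the exception type it throws.

```php
<?php

/**
 * Hypothetical helper (not part of the RFC): rejects characters outside the
 * standard Base64 alphabet and padding that is not at the end of the input,
 * but still accepts input whose padding is missing altogether.
 */
function lenient_base64_decode_sketch(string $encoded): string
{
    // Only alphabet characters, optionally followed by one or two '=' at the end.
    if (preg_match('/\A[A-Za-z0-9+\/]*={0,2}\z/', $encoded) !== 1) {
        throw new InvalidArgumentException('Invalid character or misplaced padding.');
    }

    // The existing strict mode catches the remaining malformed cases
    // (for example a truncated final group) and accepts missing padding.
    $decoded = base64_decode($encoded, true);
    if ($decoded === false) {
        throw new InvalidArgumentException('Malformed Base64 input.');
    }

    return $decoded;
}

// Today's default (non-strict) decoding silently skips '?' and misplaced '='.
var_dump(base64_decode('dG9===0bw??', false)); // string(4) "toto"

// Unpadded but otherwise valid input is accepted by the sketch.
var_dump(lenient_base64_decode_sketch('dG90bw')); // string(4) "toto"

// Both inputs from the examples above are rejected by the sketch.
foreach (['dG90bw??', 'dG9===0bw'] as $input) {
    try {
        lenient_base64_decode_sketch($input);
    } catch (InvalidArgumentException $e) {
        echo $input, ' rejected: ', $e->getMessage(), PHP_EOL;
    }
}
```

The sketch also shows why the old functions cannot simply become aliases
of the new ones: input that base64_decode accepts by default today would
start failing under the stricter rules.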