Newsgroups: php.internals
Xref: news.php.net php.internals:93429
Date: Sun, 22 May 2016 18:24:15 -0400
From: scott@paragonie.com (Scott Arciszewski)
To: PHP Internals
Subject: Re: [PHP-DEV] base64_decode is buggy, what to fix?

On Sun, May 22, 2016 at 6:56 AM, Lauri Kenttä wrote:
> Hello, Internals!
>
> I was fixing #72152 when it became apparent that the base64_decode function
> is very buggy.
>
>
> - Null byte ends processing.
>
> - "V" produces empty result, while "V=" fails. Not very logical.
>
> - Too short padding is allowed, e.g. "VV=" works like "VV==".
>
> - Extra padding is allowed (like "V=====").
>
> - Invalid padding is allowed ("=VVV=", "VV=V=", "VVV==") except on the
> second place of a 24-bit run ("V=VV=" fails).
>
> - In strict mode, space between padding fails:
> "V V==" and "VV ==" and "VV== " are allowed,
> "VV= =" fails.
>
> - In strict mode, after a padding, one character is skipped, so "VVV=V"
> decodes to "UU" (should be "UUU"), and "VVVV=*" decodes to "UUU" instead of
> failing.
>
>
> For each of the above, what would be the preferred behaviour in default mode
> and strict mode?
>
>
> Affected existing tests:
>
> - ext/openssl/tests/bug61124.phpt uses "kzo w2RMExUTYQXW2Xzxmg==" as an
> invalid base64 string, based on the invalid padding.
>
> - ext/standard/tests/file/stream_rfc2397_006.phpt tests
> "#Zm9vYmFyIGZvb2Jhcg==" and expects this to be valid, while "#" is clearly
> not valid base64. This also raises a question whether fragments should be
> skipped in data uri handling.
>
>
> Suggestions?
>
> I've created a bug-for-bug compatible rewrite of base64_decode [1], with all
> the bugs neatly and specifically implemented and missing features commented
> out, so it's now very simple to fix them one by one.
>
> I've also attached a test script that tests "all" possible combinations of
> data, padding, NUL and other invalid characters, and my first patch indeed
> provides identical results to the old implementation.
>
>
> Currently interesting lines in the test results:
> 'base64'      'default'   'strict'
> 'V'           ''          ''
> 'V='          (false)     (false)
> 'VV='         'U'         'U'
> 'VV=='        'U'         'U'
> 'V====='      (false)     (false)
> '=VVV='       'UU'        (false)
> 'VV=V='       'UU'        (false)
> 'VVV=='       'UU'        'UU'
> 'V=VV='       (false)     (false)
> 'V V=='       'U'         'U'
> 'VV =='       'U'         'U'
> 'VV== '       'U'         'U'
> 'VV= ='       'U'         (false)
> 'VVV=V'       'UUU'       'UU'
> 'VVVV=*'      'UUU'       'UUU'
> 'VVVVVV=V'    'UUUUU'     'UUUU'
> 'VVVVVV=*'    'UUUU'      'UUUU'
> 'VVVV===*'    'UUU'       'UUU'
> 'VVV====V'    'UUU'       'UU'
> 'VVV====*'    'UU'        'UU'
> 'VV=====V'    'UU'        'U'
> 'VV=====*'    'U'         'U'
> '=======*'    ''          ''
>
> --
> Lauri Kenttä
> --
> PHP Internals - PHP Runtime Development Mailing List
> To unsubscribe, visit: http://www.php.net/unsub.php

I'm surprised to see that these functions weren't heavily tested.

I've included some of the RFC 4648 test vectors in my cache-timing-safe
userland implementation at
https://github.com/paragonie/constant_time_encoding -- maybe this would be
useful for base64_decode() as well?

Scott Arciszewski
Chief Development Officer
Paragon Initiative Enterprises
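
For illustration only, here is a minimal Python sketch of one possible strict
reading of RFC 4648, checked against the test vectors from section 10 of that
RFC. It is not PHP's implementation, not the rewrite mentioned in the quoted
message, and not code from constant_time_encoding. The name strict_b64decode
and the exact rejection rules (length must be a multiple of 4, at most two
trailing "=", no characters outside the alphabet) are assumptions made for
this example. It is also not constant-time.

```python
import string

# Standard RFC 4648 alphabet: A-Z, a-z, 0-9, '+', '/'.
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"
INDEX = {ch: i for i, ch in enumerate(ALPHABET)}


def strict_b64decode(text):
    """Hypothetical strict decoder: return bytes, or None on any irregularity."""
    # Padded base64 always comes in complete 4-character groups.
    if len(text) % 4 != 0:
        return None
    body = text.rstrip("=")
    # Only one or two trailing '=' can ever be valid padding.
    if len(text) - len(body) > 2:
        return None
    bits = 0
    nbits = 0
    out = bytearray()
    for ch in body:
        # Rejects whitespace, NUL, '#', '*', and any '=' that is not trailing.
        if ch not in INDEX:
            return None
        bits = (bits << 6) | INDEX[ch]
        nbits += 6
        if nbits >= 8:
            nbits -= 8
            out.append((bits >> nbits) & 0xFF)
            bits &= (1 << nbits) - 1
    # Leftover low bits are deliberately not checked here, so a
    # non-canonical input such as "VV==" still decodes.
    return bytes(out)


# Test vectors from RFC 4648, section 10.
RFC4648_VECTORS = {
    "": b"",
    "Zg==": b"f",
    "Zm8=": b"fo",
    "Zm9v": b"foo",
    "Zm9vYg==": b"foob",
    "Zm9vYmE=": b"fooba",
    "Zm9vYmFy": b"foobar",
}

if __name__ == "__main__":
    for encoded, expected in RFC4648_VECTORS.items():
        assert strict_b64decode(encoded) == expected
```

Under these assumed rules, "VV==" decodes to b"U" and "VVV=" to b"UU", while
"V", "V=", "VV=", "V=====", "=VVV=", "VV=V=" and "VV ==" are all rejected.
Whether PHP's default or strict mode should behave this way is exactly the
question the thread leaves open.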