Newsgroups: php.internals
Xref: news.php.net php.internals:93428
Date: Sun, 22 May 2016 13:56:21 +0300
From: lauri.kentta@gmail.com (Lauri Kenttä)
To: internals@lists.php.net
Subject: base64_decode is buggy, what to fix?
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="=_6d0492583f841c477c9ad39ab530e767"

--=_6d0492583f841c477c9ad39ab530e767
Content-Transfer-Encoding: 8bit
Content-Type: text/plain; charset=UTF-8; format=flowed

Hello, Internals!

I was fixing #72152 when it became apparent that the base64_decode
function is very buggy.

- A null byte ends processing.
- "V" produces an empty result, while "V=" fails. Not very logical.
- Too short padding is allowed, e.g. "VV=" works like "VV==".
- Extra padding is allowed (like "V=====").
- Invalid padding is allowed ("=VVV=", "VV=V=", "VVV==") except in the
  second place of a 24-bit run ("V=VV=" fails).
- In strict mode, a space between padding characters fails: "V V==" and
  "VV ==" and "VV== " are allowed, "VV= =" fails.
- In strict mode, one character is skipped after padding, so "VVV=V"
  decodes to "UU" (should be "UUU"), and "VVVV=*" decodes to "UUU"
  instead of failing.

For each of the above, what would be the preferred behaviour in default
mode and in strict mode?

Affected existing tests:

- ext/openssl/tests/bug61124.phpt uses "kzo w2RMExUTYQXW2Xzxmg==" as an
  invalid base64 string, based on the invalid padding.
- ext/standard/tests/file/stream_rfc2397_006.phpt tests
  "#Zm9vYmFyIGZvb2Jhcg==" and expects this to be valid, while "#" is
  clearly not valid base64. This also raises the question of whether
  fragments should be skipped in data URI handling.

Suggestions?

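To make the strict-mode question concrete, here is a minimal sketch of one candidate rule, given only as a starting point for discussion. It is not what PHP currently does and it is not taken from the attached patch. The policy is an assumption: whitespace is ignored anywhere, padding may appear only at the end, and the number of "=" must exactly complete the last group of four. The names is_b64_data and b64_strict_shape_ok are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper (not part of PHP): true for the 64 data characters
 * of the standard base64 alphabet. */
static bool is_b64_data(unsigned char c)
{
    return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
           (c >= '0' && c <= '9') || c == '+' || c == '/';
}

/* Hypothetical strict check (an assumption for discussion, not PHP's
 * behaviour). Validates only the "shape" of the input; it does not decode. */
bool b64_strict_shape_ok(const unsigned char *s, size_t len)
{
    size_t n_data = 0, n_pad = 0;

    for (size_t i = 0; i < len; ++i) {
        unsigned char c = s[i];

        /* Assumed policy: whitespace is ignored everywhere. */
        if (c == ' ' || c == '\t' || c == '\r' || c == '\n') {
            continue;
        }
        if (c == '=') {
            ++n_pad;
            continue;
        }
        /* Reject unknown characters (including NUL) and data after padding. */
        if (!is_b64_data(c) || n_pad > 0) {
            return false;
        }
        ++n_data;
    }

    switch (n_data % 4) {
    case 0:  return n_pad == 0; /* complete groups need no padding */
    case 2:  return n_pad == 2; /* one output byte, two pad characters */
    case 3:  return n_pad == 1; /* two output bytes, one pad character */
    default: return false;      /* a lone 6-bit group cannot encode a byte */
    }
}
```

Under this assumed rule, "VV==" and "VV= =" would pass, while "V", "V=", "VV=", "VVV==", "=VVV=" and "VVVV=*" would all be rejected. Whether default mode should be equally picky is a separate question.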
I've created a bug-for-bug compatible rewrite of base64_decode [1], with
all the bugs neatly and specifically implemented and missing features
commented out, so it's now very simple to fix them one by one.

I've also attached a test script that tests "all" possible combinations
of data, padding, NUL and other invalid characters, and my first patch
indeed provides identical results to the old implementation.

Currently interesting lines in the test results:

'base64'     'default'   'strict'
'V'          ''          ''
'V='         (false)     (false)
'VV='        'U'         'U'
'VV=='       'U'         'U'
'V====='     (false)     (false)
'=VVV='      'UU'        (false)
'VV=V='      'UU'        (false)
'VVV=='      'UU'        'UU'
'V=VV='      (false)     (false)
'V V=='      'U'         'U'
'VV =='      'U'         'U'
'VV== '      'U'         'U'
'VV= ='      'U'         (false)
'VVV=V'      'UUU'       'UU'
'VVVV=*'     'UUU'       'UUU'
'VVVVVV=V'   'UUUUU'     'UUUU'
'VVVVVV=*'   'UUUU'      'UUUU'
'VVVV===*'   'UUU'       'UUU'
'VVV====V'   'UUU'       'UU'
'VVV====*'   'UU'        'UU'
'VV=====V'   'UU'        'U'
'VV=====*'   'U'         'U'
'=======*'   ''          ''

-- 
Lauri Kenttä

--=_6d0492583f841c477c9ad39ab530e767
Content-Transfer-Encoding: base64
Content-Type: text/x-php; name=base64_decode_test.php
Content-Disposition: attachment; filename=base64_decode_test.php; size=811

PD9waHAKZnVuY3Rpb24gc3RyX2Zvcm1hdCgkcykgewoJcmV0dXJuIGlzX3N0cmluZygkcykgPyAi JyRzJyIgOiAoJHMgPT09IGZhbHNlID8gIihmYWxzZSkiIDogIigiLmdldHR5cGUoJHMpLiIgPz8/ KSIpOwp9CmZ1bmN0aW9uIGxpbmUoJGEsICRiLCAkYykgewoJI2lmICgkYyAhPT0gZmFsc2UgJiYg KCRiICE9PSAkYyB8fCBzdHJwb3MoJGEsICIqIikgIT09IGZhbHNlIHx8IHN0cnBvcygkYSwgIlww IikgIT09IGZhbHNlKSkKCXByaW50ZigiJS04cyBcdCAlLThzIFx0ICUtOHNcbiIsIHN0cl9mb3Jt YXQoJGEpLCBzdHJfZm9ybWF0KCRiKSwgc3RyX2Zvcm1hdCgkYykpOwp9CgpsaW5lKCJiYXNlNjQi LCAiZGVmYXVsdCIsICJzdHJpY3QiKTsKZm9yZWFjaCAoWyJWIiwgIlY9IiwgIlZWPSIsICJWVj09 IiwgIlY9PT09PSIsICI9VlZWPSIsICJWVj1WPSIsICJWVlY9PSIsICJWPVZWPSIsICJWIFY9PSIs ICJWViA9PSIsICJWVj09ICIsICJWVj0gPSIsICJWVlY9ViIsICJWVlZWPSoiXSBhcyAkdikgewoJ bGluZSgkdiwgYmFzZTY0X2RlY29kZSgkdiksIGJhc2U2NF9kZWNvZGUoJHYsIHRydWUpKTsKfQoK JHQgPSAiVj0qXDAiOwpmb3IgKCRpID0gMDsgJGkgPCAoMTw8MTYpOyArKyRpKSB7CgkkdiA9IGx0 
cmltKCR0WygkaT4+MTQpJjNdLiR0WygkaT4+MTIpJjNdLiR0WygkaT4+MTApJjNdLiR0WygkaT4+ OCkmM10uJHRbKCRpPj42KSYzXS4kdFsoJGk+PjQpJjNdLiR0WygkaT4+MikmM10uJHRbKCRpPj4w KSYzXSwgIlwwIik7CglsaW5lKCR2LCBiYXNlNjRfZGVjb2RlKCR2KSwgYmFzZTY0X2RlY29kZSgk diwgdHJ1ZSkpOwp9Cg== --=_6d0492583f841c477c9ad39ab530e767 Content-Transfer-Encoding: base64 Content-Type: text/x-diff; name=base64_v0.patch Content-Disposition: attachment; filename=base64_v0.patch; size=4321 Y29tbWl0IGEzODUxMGVhYzdiODUyYmRiOGQ1ODAxODQ4MDJmMGQ1MDkwYjM1ZmQKQXV0aG9yOiBM YXVyaSBLZW50dMOkIDxsYXVyaS5rZW50dGFAZ21haWwuY29tPgpEYXRlOiAgIFN1biBNYXkgMjIg MTM6MTE6NDcgMjAxNiArMDMwMAoKICAgIGJhc2U2NF9kZWNvZGU6IHJlaW1wbGVtZW50IGNsZWFu bHkgKGJ1Zy1mb3ItYnVnKQoKZGlmZiAtLWdpdCBhL2V4dC9zdGFuZGFyZC9iYXNlNjQuYyBiL2V4 dC9zdGFuZGFyZC9iYXNlNjQuYwppbmRleCA4MWY4MjZjLi5iYzYzMzI5IDEwMDY0NAotLS0gYS9l eHQvc3RhbmRhcmQvYmFzZTY0LmMKKysrIGIvZXh0L3N0YW5kYXJkL2Jhc2U2NC5jCkBAIC0xMzUs NzUgKzEzNSw5OSBAQCBQSFBBUEkgemVuZF9zdHJpbmcgKnBocF9iYXNlNjRfZGVjb2RlKGNvbnN0 IHVuc2lnbmVkIGNoYXIgKnN0ciwgc2l6ZV90IGxlbmd0aCkgLwogCiBQSFBBUEkgemVuZF9zdHJp bmcgKnBocF9iYXNlNjRfZGVjb2RlX2V4KGNvbnN0IHVuc2lnbmVkIGNoYXIgKnN0ciwgc2l6ZV90 IGxlbmd0aCwgemVuZF9ib29sIHN0cmljdCkgLyoge3t7ICovCiB7Ci0JY29uc3QgdW5zaWduZWQg Y2hhciAqY3VycmVudCA9IHN0cjsKLQlpbnQgY2gsIGkgPSAwLCBqID0gMCwgazsKLQkvKiB0aGlz IHN1Y2tzIGZvciB0aHJlYWRlZCBlbnZpcm9ubWVudHMgKi8KKwlpbnQgdmFsX2I2NCwgcGFkZGlu ZyA9IDA7CisJc2l6ZV90IGksIG5faW4gPSAwLCBuX291dCA9IDA7CiAJemVuZF9zdHJpbmcgKnJl c3VsdDsKIAotCXJlc3VsdCA9IHplbmRfc3RyaW5nX2FsbG9jKGxlbmd0aCwgMCk7CisJcmVzdWx0 ID0gemVuZF9zdHJpbmdfYWxsb2MoKGxlbmd0aCArIDMpIC8gNCAqIDMsIDApOwogCiAJLyogcnVu IHRocm91Z2ggdGhlIHdob2xlIHN0cmluZywgY29udmVydGluZyBhcyB3ZSBnbyAqLwotCXdoaWxl ICgoY2ggPSAqY3VycmVudCsrKSAhPSAnXDAnICYmIGxlbmd0aC0tID4gMCkgewotCQlpZiAoY2gg PT0gYmFzZTY0X3BhZCkgewotCQkJaWYgKCpjdXJyZW50ICE9ICc9JyAmJiAoKGkgJSA0KSA9PSAx IHx8IChzdHJpY3QgJiYgbGVuZ3RoID4gMCkpKSB7Ci0JCQkJaWYgKChpICUgNCkgIT0gMSkgewot CQkJCQl3aGlsZSAoaXNzcGFjZSgqKCsrY3VycmVudCkpKSB7Ci0JCQkJCQljb250aW51ZTsKLQkJ 
CQkJfQotCQkJCQlpZiAoKmN1cnJlbnQgPT0gJ1wwJykgewotCQkJCQkJY29udGludWU7Ci0JCQkJ CX0KKwlmb3IgKGkgPSAwOyBpIDwgbGVuZ3RoOyArK2kpIHsKKwkJLyogc3RvcCBvbiBudWxsIGJ5 dGUgICovCisJCS8qIEZJWE1FOiB0aGlzIGlzIHdyb25nIGJlaGF2aW91ciwgcmVtb3ZlIHRoaXMh ICovCisJCWlmIChzdHJbaV0gPT0gMCkgeworCQkJYnJlYWs7CisJCX0KKwkJLyogY291bnQgcGFk ZGluZyBjaGFyYWN0ZXJzICovCisJCWlmIChzdHJbaV0gPT0gYmFzZTY0X3BhZCkgeworCQkJLyog ZmFpbCBpZiB0aGUgcGFkZGluZyBjaGFyYWN0ZXIgaXMgc2Vjb25kIGluIGEgZ3JvdXAgKGxpa2Ug QT09PSkgKi8KKwkJCS8qIEZJWE1FOiB3aHkgd2Ugc3RpbGwgYWxsb3cgaW52YWxpZCBwYWRkaW5n IGluIG90aGVyIHBsYWNlcyBpbiB0aGUgbWlkZGxlIG9mIHRoZSBzdHJpbmc/ICovCisJCQlpZiAo bl9pbiAlIDQgPT0gMSkgeworCQkJCWdvdG8gZmFpbDsKKwkJCX0KKwkJCS8qIGluIHN0cmljdCBt b2RlLCB3aGVuIHRoZSBwYWRkaW5nIGVuZHMsIHNraXAgb25lIChhbnkpIGNoYXJhY3Rlciwgc2tp cCB3aGl0ZXNwYWNlcywKKwkJCSAqIGFuZCByZXR1cm4gRkFMU0UgaWYgdGhlIG5leHQgY2hhcmFj dGVyIGlzIG5vdCBOVUwsIG90aGVyd2lzZSByZXR1cm4gdGhlIGN1cnJlbnQgZGVjb2RlZCBzdHJp bmcgKi8KKwkJCS8qIEZJWE1FOiB0aGlzIGlzIHdyb25nIGJlaGF2aW91ciBhbmQgbWF5IHJlYWQg cGFzdC10aGUtZW5kLCByZW1vdmUgdGhpcyEgKi8KKwkJCWlmIChzdHJpY3QgJiYgaSAhPSBsZW5n dGggLSAxICYmIHN0cltpKzFdICE9IGJhc2U2NF9wYWQpIHsKKwkJCQlpICs9IDI7CisJCQkJd2hp bGUgKGlzc3BhY2Uoc3RyW2ldKSkgeworCQkJCQlpICs9IDE7CiAJCQkJfQotCQkJCXplbmRfc3Ry aW5nX2ZyZWUocmVzdWx0KTsKLQkJCQlyZXR1cm4gTlVMTDsKKwkJCQlpZiAoc3RyW2ldID09IDAp IHsKKwkJCQkJYnJlYWs7CisJCQkJfQorCQkJCWdvdG8gZmFpbDsKKwkJCX0KKwkJCS8qIHN0cmlj dDogZmFpbCBpZiB0aGVyZSBpcyBhIHNwYWNlIGJldHdlZW4gcGFkZGluZyBjaGFyYWN0ZXJzICov CisJCQkvKiBGSVhNRTogdGhpcyBpcyB3cm9uZyBiZWhhdmlvdXIsIHJlbW92ZSB0aGlzISAqLwor CQkJaWYgKHN0cmljdCAmJiBwYWRkaW5nICYmIHN0cltpLTFdICE9IGJhc2U2NF9wYWQpIHsKKwkJ CQlnb3RvIGZhaWw7CiAJCQl9CisJCQkvKiBzdHJpY3Q6IG1heGltdW0gcGFkZGluZyBpcyB0d28g Y2hhcmFjdGVycyAqLworCQkJLyogRklYTUU6IGVuYWJsZSB0aGlzIQorCQkJaWYgKHN0cmljdCAm JiBwYWRkaW5nID09IDIpIHsKKwkJCQlnb3RvIGZhaWw7CisJCQl9CisJCQkqLworCQkJKytwYWRk aW5nOwogCQkJY29udGludWU7CiAJCX0KLQotCQljaCA9IGJhc2U2NF9yZXZlcnNlX3RhYmxlW2No 
XTsKLQkJaWYgKCghc3RyaWN0ICYmIGNoIDwgMCkgfHwgY2ggPT0gLTEpIHsgLyogYSBzcGFjZSBv ciBzb21lIG90aGVyIHNlcGFyYXRvciBjaGFyYWN0ZXIsIHdlIHNpbXBseSBza2lwIG92ZXIgKi8K KwkJdmFsX2I2NCA9IGJhc2U2NF9yZXZlcnNlX3RhYmxlW3N0cltpXV07CisJCS8qIHNwYWNlcyBh bmQgdW5rbm93biBjaGFyYWN0ZXJzICovCisJCWlmICh2YWxfYjY0IDwgMCkgeworCQkJLyogc3Ry aWN0OiBmYWlsIG9uIHVua25vd24gY2hhcmFjdGVycyAqLworCQkJaWYgKHN0cmljdCAmJiB2YWxf YjY0ID09IC0yKSB7CisJCQkJZ290byBmYWlsOworCQkJfQogCQkJY29udGludWU7Ci0JCX0gZWxz ZSBpZiAoY2ggPT0gLTIpIHsKLQkJCXplbmRfc3RyaW5nX2ZyZWUocmVzdWx0KTsKLQkJCXJldHVy biBOVUxMOwogCQl9CisJCS8qIHN0cmljdDogZmFpbCBpZiBkYXRhIGZvbGxvd3MgcGFkZGluZyAq LworCQlpZiAoc3RyaWN0ICYmIHBhZGRpbmcpIHsKKwkJCWdvdG8gZmFpbDsKKwkJfQorCQkvKiBm b3JnZXQgaW52YWxpZCBwYWRkaW5nICovCisJCXBhZGRpbmcgPSAwOwogCi0JCXN3aXRjaChpICUg NCkgeworCQlzd2l0Y2ggKG5faW4rKyAlIDQpIHsKIAkJY2FzZSAwOgotCQkJWlNUUl9WQUwocmVz dWx0KVtqXSA9IGNoIDw8IDI7CisJCQlaU1RSX1ZBTChyZXN1bHQpW25fb3V0XSA9IHZhbF9iNjQg PDwgMjsKIAkJCWJyZWFrOwogCQljYXNlIDE6Ci0JCQlaU1RSX1ZBTChyZXN1bHQpW2orK10gfD0g Y2ggPj4gNDsKLQkJCVpTVFJfVkFMKHJlc3VsdClbal0gPSAoY2ggJiAweDBmKSA8PCA0OworCQkJ WlNUUl9WQUwocmVzdWx0KVtuX291dCsrXSB8PSB2YWxfYjY0ID4+IDQ7CisJCQlaU1RSX1ZBTChy ZXN1bHQpW25fb3V0XSA9ICh2YWxfYjY0ICYgMHgwZikgPDwgNDsKIAkJCWJyZWFrOwogCQljYXNl IDI6Ci0JCQlaU1RSX1ZBTChyZXN1bHQpW2orK10gfD0gY2ggPj4yOwotCQkJWlNUUl9WQUwocmVz dWx0KVtqXSA9IChjaCAmIDB4MDMpIDw8IDY7CisJCQlaU1RSX1ZBTChyZXN1bHQpW25fb3V0Kytd IHw9IHZhbF9iNjQgPj4gMjsKKwkJCVpTVFJfVkFMKHJlc3VsdClbbl9vdXRdID0gKHZhbF9iNjQg JiAweDAzKSA8PCA2OwogCQkJYnJlYWs7CiAJCWNhc2UgMzoKLQkJCVpTVFJfVkFMKHJlc3VsdClb aisrXSB8PSBjaDsKKwkJCVpTVFJfVkFMKHJlc3VsdClbbl9vdXQrK10gfD0gdmFsX2I2NDsKIAkJ CWJyZWFrOwogCQl9Ci0JCWkrKzsKIAl9Ci0KLQlrID0gajsKLQkvKiBtb3AgdGhpbmdzIHVwIGlm IHdlIGVuZGVkIG9uIGEgYm91bmRhcnkgKi8KLQlpZiAoY2ggPT0gYmFzZTY0X3BhZCkgewotCQlz d2l0Y2goaSAlIDQpIHsKLQkJY2FzZSAxOgotCQkJemVuZF9zdHJpbmdfZnJlZShyZXN1bHQpOwot CQkJcmV0dXJuIE5VTEw7Ci0JCWNhc2UgMjoKLQkJCWsrKzsKLQkJY2FzZSAzOgotCQkJWlNUUl9W 
QUwocmVzdWx0KVtrXSA9IDA7Ci0JCX0KKwkvKiBGSVhNRTogZmFpbCBpZiB0aGUgbGFzdCAyNC1i aXQgc2VxdWVuY2UgaGFkIG9ubHkgNiBiaXRzIHNldCAobGlrZSBBPT09KQorCWlmIChuX2luICUg NCA9PSAxKSB7CisJCWdvdG8gZmFpbDsKIAl9Ci0JWlNUUl9MRU4ocmVzdWx0KSA9IGo7CisJKi8K KworCVpTVFJfTEVOKHJlc3VsdCkgPSBuX291dDsKIAlaU1RSX1ZBTChyZXN1bHQpW1pTVFJfTEVO KHJlc3VsdCldID0gJ1wwJzsKIAogCXJldHVybiByZXN1bHQ7CitmYWlsOgorCXplbmRfc3RyaW5n X2ZyZWUocmVzdWx0KTsKKwlyZXR1cm4gTlVMTDsKIH0KIC8qIH19fSAqLwogCg== --=_6d0492583f841c477c9ad39ab530e767--