Newsgroups: php.internals Path: news.php.net Xref: news.php.net php.internals:100457 Return-Path: Mailing-List: contact internals-help@lists.php.net; run by ezmlm Delivered-To: mailing list internals@lists.php.net Received: (qmail 90297 invoked from network); 7 Sep 2017 21:33:06 -0000 Received: from unknown (HELO lists.php.net) (127.0.0.1) by localhost with SMTP; 7 Sep 2017 21:33:06 -0000 Authentication-Results: pb1.pair.com smtp.mail=yohgaki@ohgaki.net; spf=pass; sender-id=pass Authentication-Results: pb1.pair.com header.from=yohgaki@ohgaki.net; sender-id=pass Received-SPF: pass (pb1.pair.com: domain ohgaki.net designates 180.42.98.130 as permitted sender) X-PHP-List-Original-Sender: yohgaki@ohgaki.net X-Host-Fingerprint: 180.42.98.130 ns1.es-i.jp Received: from [180.42.98.130] ([180.42.98.130:35094] helo=es-i.jp) by pb1.pair.com (ecelerity 2.1.1.9-wez r(12769M)) with ESMTP id 15/87-10715-D0BB1B95 for ; Thu, 07 Sep 2017 17:33:04 -0400 Received: (qmail 45243 invoked by uid 89); 7 Sep 2017 21:32:57 -0000 Received: from unknown (HELO mail-io0-f182.google.com) (yohgaki@ohgaki.net@209.85.223.182) by 0 with ESMTPA; 7 Sep 2017 21:32:57 -0000 Received: by mail-io0-f182.google.com with SMTP id n69so844155ioi.5 for ; Thu, 07 Sep 2017 14:32:56 -0700 (PDT) X-Gm-Message-State: AHPjjUgMALFKBe9hhnyZFRN1aqU51V2ZKjEgUndjT9lw72pxP/gFe4h6 LpiP+TnzAJTRLuxkd9HrXgOkWPESpQ== X-Google-Smtp-Source: AOwi7QA3uSuC/SGiukAZ76b5i0tYt183pJyOm0qrK2zMpxcFtwjyhDINYW6NaciltfO7563ConItz3t5owrRLIDnepc= X-Received: by 10.107.170.227 with SMTP id g96mr780977ioj.311.1504819970755; Thu, 07 Sep 2017 14:32:50 -0700 (PDT) MIME-Version: 1.0 Received: by 10.79.72.5 with HTTP; Thu, 7 Sep 2017 14:32:09 -0700 (PDT) In-Reply-To: References: Date: Fri, 8 Sep 2017 06:32:09 +0900 X-Gmail-Original-Message-ID: Message-ID: To: "internals@lists.php.net" Cc: Nikita Popov Content-Type: multipart/alternative; boundary="001a1142dd30ced5790558a033cb" Subject: Re: hash_hkdf() signature and return value From: yohgaki@ohgaki.net (Yasuo Ohgaki) --001a1142dd30ced5790558a033cb Content-Type: text/plain; charset="UTF-8" Hi RMs, I suppose nobody can give example(s) that justify current API. I'll leave this issue to RMs decision, since I think this result in no conclusion. Please consider carefully if the current API should be kept or not. I wrote summary for discussion for you. Regards, Misunderstandings summary from previous discussion: Nikita misunderstood what HKDF is made for. He said key derivation that generates AES 128 bit key from AES 256 bit key for compatibility reason as follows is "textbook example" and good(recommended?) usecase. $sha128key = hash_hkdf("sha256", $sha256key, 16); Although this could be acceptable useage when salt cannot be used, but not recommended. HKDF is specifically designed to discourage this kind of key derivation because this is weak. The RFC strongly encourage salt use for key security. Appropriate key derivation is $sha128key = hash_hkdf("sha256", $sha256key, 32, "", $csprng_generated_random_key); // Store $cspring_generated_ random_key to somewhere. The reason why latter is a lot more secure is related to Andrey's misunderstanding. He said "when ikm is cryptographically strong, salt wouldn't add no more entropy. so salt does not matter". (not exact sentence) What he said partially true, but wrong in a sense of key security. When ikm has enough entropy, HKDF does not add more entropy. However, even when ikm is strong, additional salt adds significant complexities for key analysis. To make things simpler, suppose we have very simple KDF as follows. okm = VSKDF(ikm, salt) where - okm, ikm, salt is 2 digit int - when salt is NULL, default to 2 - VSKDF is simple function that multiply ikm and salt (ikm * salt) returns last 2 digits as derived key. When ikm=64, salt=NULL 28 = VSKDF(64, NULL); Since algorithm is known, attackers are able to guess ikm very easily. Now, suppose salt=16 is provided 24 = VSKDF(64, 16); It is still easy to guess due to very simple KDF, but guessing ikm is a lot harder than previous example while it does not add any entropy. HKDF has a lot more complex algorithm, therefore salt gives significantly stronger key protection even when salt is disclosed as noted in the RFC. Other misunderstanding should be noted is what HKDF for. It's for general purpose KDF as the RFC described in HKDF application section. Andrey said "I'm cherry picking sentence", but the section should be what the HKDF for. The section even describes obscure usage, HKDF for CSPRNG. This usage is not even key derivation. This one I'm not sure misunderstanding or not, but probably it is. HKDF is designed for any ikm and works with appropriate usage. Very weak ikm like user entered password can be handled relatively safely. $safe_key = hash_hkdf("sha256", 'mypassword', 0, '', $csprng_generated_random_key); // $csprng_generated_random_key should be kept secret because ikm is too weak Without salt, it's disaster. Please note that salt is the last optional parameter currently. $dangerous_key = hash_hkdf("sha256", 'mypassword'); // Disaster! With this usage, attackers can build pre hashed dictionary. Even when they don't have dictionary, usual brute force attack is very effective. One may think additional hashing would mitigate risk. i.e. $dangerous_key = hash_hkdf("sha256", hash("sha256", 'mypassword')); // Disaster! This does not help much when algorithm is known to attackers. Brute force attack is effective still. Adding secret salt(key) helps with this usage also. Please remember crypt() abuse history. We should imagine similar thing could happen with hash_hkdf() API. Current API encourages misuse/abuse too much by making salt the last optional parameter. Better documentation may help, "Users must use random $salt always whenever it is possible even if it is the last optional parameter.". However, do we really keep this mess forever? for a new function? (BTW, Nikita, are you going to take your statement back or not. It's too impolite) -- Yasuo Ohgaki yohgaki@ohgaki.net On Wed, Sep 6, 2017 at 10:15 AM, Yasuo Ohgaki wrote: > Hi all, > > This is the last recommendation for hash_hkdf[1]. In fact, > this would be the last chance to fix because we'll have 7.2 soon. > The issue is secure usage and API consistency. > > Currently hash_hkdf() has following signature: > > hash_hkdf(string $algo , string $ikm [, int $length = 0 [, > string $info = '' [, string $salt = '' ]]] ) > > These are rationals behind recommendation. There are many, but > please read on. > > === Parameter Order === > > HKDF[2] algorithm is: > 1. General purpose key derivation function as per RFC 5869 > 2. "$salt" parameter is a "pre-shared _KEY_" in many cases as mentioned > RFC 5869 > 3. "$salt" (or preshared key) is very strongly recommended for security > season as per RFC 5869 > 4. Supplying salt that the same length of input key does not affect > performance as per RFC 5969 > 5. "$info" is what makes HKDF useful, but it's less important as > described in RFC 5869 > 6. "$length" is truly an optional parameter for very specific encryption > algorithm or usage. > > Rationale for change: > 1. Key derivations without secondary key ($salt) does not make sense when > secondary key could be available. HKDF is designed for best possible > key > security with the key. Not using secondary key ($salt) simply > downgrades > key security without no reason. i.e. HKDF performance is the same > when $salt has the same as hash is set. > 2. HKDF is based on HMAC. When $info has no use, HMAC would be the best > choice for it. i.e. $newkey = hash_hmac($ikm, $key); > 3. It does not make sense violating RFC recommendations for a RFC > implementation. > > From these facts and reasons, $salt, $info and $length parameter order and > requirement should be changed from > > string $algo , string $ikm [, int $length = 0 [, string $info = '' [, > string $salt = '' ]]] > > to > > string $algo , string $ikm , string $salt, string $info = '' [, > int $length = 0 ] > Note: Users can set empty string if they really don't need $salt and/or > $info. > > Conclusion: > This way, users would have better chances to use hash_hkdf() more securely > and > properly. > > [1] http://php.net/hash_hkdf > [2] http://www.faqs.org/rfcs/rfc5869.html > > === Return Value and Output Option === > > The most common HKDF usage with PHP would be: > 1. CSRF token generation that is specific to a request with expiration > time. > (HEX return value would be appropriate, not BINARY) > 2. API access token generation that does not transmit "The API Key", but > derived key by HKDF. It also should have expiration time. > (HEX return value would be appropriate, not BINARY) > > Consistency with other hash_*() functions: > 1. All of other hash_*() returns HEX string hash value. > 2. hash_hkdf() is the only one returns BINARY hash value. > > Conclusion: > hash_hkdf() should return HEX by default, not BINARY. > Optional [, bool $raw_output = false ] should be added just like other > hash_*(). > > > === Compatibility === > > IMHO, current hash_hkdf() should not be added by PHP 7.1.2, but 7.2.0 > in the first place. The mess could be resolved by 7.2. > > Anyway, hash_hkdf() is added 7.1.2. Hopefully not many users are > using it yet. If we change API with 7.2 release, there would be least > possible confusions. (We may remove it from 7.1 to avoid confusions, too) > > Our choices: > - Keep the current insecure/inconsistent API forever. > - Change the API to have secure/consistent API forever. > > Conclusion: > No conclusion for this. There would be conflicting options. > > > I strongly think it is not worth to keep this insecure/inconsistent API > forever. > I prefer to change the API to what it should be. > > What should we do for this? > Comments? > > > > P.S. > > Nikita, you've said following during HKDF discussion: > - HKDF for CSRF tokens/etc does not make sense at all. > - Current parameter order and return value makes perfect sense > and has valid/common usages. > - Everyone one this list, shouldn't listen to me because I'm insane and > totally misunderstood what the HKDF is. > > Phrases are not the exact, but it should be correct enough. Some part > is my fault, w/o reading mail and/or poor English. I blindly assumed > you've understand RFC 5869 and possible common usages with PHP. I > apologized for the confusion and tried to explain why w/o success. > > If you still believe what you've said is correct, I don't request you to > take these back, show us at least one common/reasonable hash_hkdf() > usage example for current API, and point out what's wrong my > recommendations > and rationales above. > > If not, I request you to take it back. I respect your contributions much, > but > the last one you've said is out of tolerance. > > -- > Yasuo Ohgaki > yohgaki@ohgaki.net > > --001a1142dd30ced5790558a033cb--