From: yohgaki@ohgaki.net (Yasuo Ohgaki)
To: "internals@lists.php.net", Andrey Andreev, Nikita Popov
Date: Wed, 8 Feb 2017 04:35:11 +0900
Subject: hash_hkdf() signature
Newsgroups: php.internals

Hi Nikita, Andrey and all,

My apologies, I misread mails
through very sloppy reading. I'll explain the basis of my idea clearly
and properly this time. This mail is long.

The basis of my idea is:

 - Salt is optional only for applications where such a value is not
   available. (From RFC 5869)
 - Omitting salt could lead to a security disaster, i.e. a password leak.
 - Using the output key and salt together as the final key (a "combined
   key") is common in many use cases.
 - Many HKDF applications written in PHP must/should have a salt for a
   better implementation.
 - The API should encourage "salt" use through its signature.
   (From RFC 5869)

Ref: https://tools.ietf.org/html/rfc5869

Before restarting the discussion, I should give the rationale for others.

In theory, cryptographic hashes are cryptographically secure. Therefore,
the following operations should be considered secure by definition.

    $new_key = sha1('some original key' . 'strong salt' . 'some info');
    $signature = sha1('some data' . 'strong key');

However, in the real world, people come up with better ideas for
cryptographic hashes. Sometimes people invent an efficient way to attack
them. HMAC is a known and proven method to generate a more secure
signature. Therefore,

    $signature = hash_hmac('sha1', 'some data', 'some key');

is secure even when the cryptographic hash has minor defect(s).

HKDF is designed to derive secure new keys, suitable for the required
operations, from an existing key by using HMAC.

HKDF inputs

 - IKM: the input key, which may be weak or strong
 - salt: some entropy which makes HKDF stronger overall; may be secret or
   non-secret, weak or strong
 - info: specifies the HKDF context, which is usually non-secret, e.g. a
   protocol number, algorithm identifiers, user identities, etc.
 - length (L): output key length

HKDF calculates the output key (OKM) as follows.

Extract step

    PRK = HMAC-Hash(salt, IKM)

This step is designed to always make the PRK, and therefore the output
key (OKM), strong. For the OKM to be secure, either the IKM or the salt
must be strong.

Expand step

    N = ceil(L/HashLen)
    T = T(1) | T(2) | T(3) | ... | T(N)
    OKM = first L octets of T

    where:
    T(0) = empty string (zero length)
    T(1) = HMAC-Hash(PRK, T(0) | info | 0x01)
    T(2) = HMAC-Hash(PRK, T(1) | info | 0x02)
    T(3) = HMAC-Hash(PRK, T(2) | info | 0x03)
    ...

Note: OKM is the output key material, i.e. the return value of HKDF.

This step is designed to give the derived key the requested length (L),
starting from the strong key (PRK) generated by the Extract step. The key
context (info) is also distinguished in this step.

Both salt and info are optional, but RFC 5869 treats them differently.

"HKDF is defined to operate with and without random salt. This is
done to accommodate applications where a salt value is not available.
We stress, however, that the use of salt adds significantly to the
strength of HKDF, ensuring independence between different uses of the
hash function, supporting "source-independent" extraction, and
strengthening the analytical results that back the HKDF design."

This statement implies that salt is an almost mandatory parameter for
HKDF whenever a salt can be used. In contrast, info is described as a
purely optional parameter for the key context.

"While the 'info' value is optional in the definition of HKDF, it is
often of great importance in applications. Its main objective is to
bind the derived key material to application- and context-specific
information. For example, 'info' may contain a protocol number,
algorithm identifiers, user identities, etc."

As for which parameters are mandatory: a strong salt is mandatory to
derive a cryptographically strong key when the input key is weak, while
info and length are always optional.

In addition, it is common for the salt to be used as part of a combined
key. Salt is mandatory for such applications. There are many
authentication implementations that use

 - a key which does not disclose the original key, thanks to hashing
 - a nonce (salt)

Access permission with a timeout is another typical usage; it requires a
timestamp as the salt and as part of the combined key.
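To make the Extract and Expand steps above concrete, here is a minimal sketch of them written with hash_hmac() only. The function name hkdf_sketch() is hypothetical and not a PHP built-in, and the parameter order simply mirrors the current hash_hkdf(). It is meant to show where salt, info and length enter the calculation, not to replace the built-in.

```php
<?php
// Sketch of the RFC 5869 steps. hkdf_sketch() is a hypothetical name.
function hkdf_sketch(string $algo, string $ikm, int $length,
                     string $info = '', string $salt = ''): string
{
    $hashLen = strlen(hash($algo, '', true));
    if ($length < 1 || $length > 255 * $hashLen) {
        throw new InvalidArgumentException('length must be 1..255*HashLen');
    }
    if ($salt === '') {
        // RFC 5869: if no salt is given, HashLen zero bytes are used.
        $salt = str_repeat("\0", $hashLen);
    }

    // Extract: PRK = HMAC-Hash(salt, IKM).
    // Note the argument order of hash_hmac(): (algo, data, key).
    $prk = hash_hmac($algo, $ikm, $salt, true);

    // Expand: T(i) = HMAC-Hash(PRK, T(i-1) | info | i), OKM = first L octets.
    $t = '';
    $okm = '';
    $n = (int) ceil($length / $hashLen);
    for ($i = 1; $i <= $n; $i++) {
        $t = hash_hmac($algo, $t . $info . chr($i), $prk, true);
        $okm .= $t;
    }
    return substr($okm, 0, $length);
}

// Example inputs are placeholders chosen for illustration.
$okm = hkdf_sketch('sha256', 'some input key', 42, 'some info', 'some salt');
```

The salt only appears in the Extract line and info only in the Expand loop, which is the distinction the rest of this mail relies on.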
HKDF can produce a secure key, protecting the input key (IKM) and
generating a strong output key (OKM), but this is true only when the IKM
or the salt is strong. Use of a weak IKM and salt could lead to a
password leak.

For the above reasons, I'm proposing to change

    string hash_hkdf(string $hash, string $ikm [, int $length=0 [, string $info='' [, string $salt='']]])

to

    string hash_hkdf(string $hash, string $ikm, string $salt [, string $info='' [, int $length=0 ]])

 - To omit salt, pass $salt=NULL. $salt='' raises an exception.

On Mon, Jan 16, 2017 at 8:08 PM, Nikita Popov wrote:

> Making the salt required makes no sense to me.
>
> HKDF has a number of different applications:
> a) Derive multiple strong keys from strong keying material. Typical case
> for this is deriving independent encryption and authentication keys from a
> master key. This requires only specification of $length. A salt is neither
> necessary nor useful in this case, because you start with strong
> cryptographic keying material.
>

Very true. A shorter password would hide the IKM/salt/info state, so a
shorter output key could be said to be more secure, e.g. SHA512 with a
32-byte output key. This is only applicable when the IKM and/or salt is
strong, though.

If we could assume the IKM is always cryptographically strong, the
Extract step

    PRK = HMAC-Hash(salt, IKM)

is not needed, only

    T(1) = HMAC-Hash(PRK, T(0) | info | 0x01)

In short, if the key is strong,

    hash_hmac('sha256', $key, 1)
    substr(hash_hmac('sha256', $key, 1), 0, $length)

is good enough. Thus, HKDF is not needed.

But in the real world, the IKM does not have to be cryptographically
secure. Therefore, HKDF users must ensure a secure PRK through

    PRK = HMAC-Hash(salt, IKM)

This requirement makes "salt" more important than "length", because a
strong PRK is mandatory for HKDF security, i.e. the IKM or the salt must
always be strong.

For the same reason as "apply HTML escaping to all variables", using a
random salt whenever possible, regardless of key strength, is safer. It
effectively prevents a weak IKM from leaking through misuse.
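The proposed salt-first parameter order can be tried out today as a userland wrapper around the current built-in. This is only a sketch under assumptions: hkdf_salt_first() is a hypothetical name, the use of InvalidArgumentException is an arbitrary choice for "raises an exception", and the info strings are placeholders.

```php
<?php
// Hypothetical wrapper (not part of PHP) with the proposed order:
// hash, ikm, salt, info, length. null omits the salt explicitly,
// an empty string is rejected.
function hkdf_salt_first(string $hash, string $ikm, ?string $salt,
                         string $info = '', int $length = 0): string
{
    if ($salt === '') {
        throw new InvalidArgumentException('Pass null to omit the salt explicitly.');
    }
    // Current built-in order (PHP 7.1.2+): hash, ikm, length, info, salt.
    return hash_hkdf($hash, $ikm, $length, $info, $salt ?? '');
}

$ikm  = random_bytes(32);
$salt = random_bytes(32);   // random salt regardless of key strength

$key = hkdf_salt_first('sha256', $ikm, $salt, 'encryption');

// Omitting the salt is still possible, but has to be spelled out.
$unsalted = hkdf_salt_first('sha256', $ikm, null, 'encryption');
```

With this order the caller has to write something in the salt position on every call, which is the "think about it at least" effect the proposal is after.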
> b) Generating per-session (or similar) keys from a (strong cryptographic)
> master key. For this purpose you can specify the $info parameter. again, a
> salt is neither necessary nor useful in this case. (You could probably also
> use $salt instead of $info in this case, but the design of the function
> implies that $info should be used for this purpose.)
>

Assuming

    $_SESSION['master_key'] = random_bytes(80); // Keep this for this session.

and the current hash_hkdf() function signature

    string hash_hkdf(string $hash_function, string $ikm, [int $length [, string $info [, string $salt]]]);

your idea might be

    $new_key = hash_hkdf('sha256', $_SESSION['master_key'], 0, session_id());

Although it works, a session ID is not an identity but a key that
identifies the user's connection. Passing a key (entropy) value as the
info parameter violates the info usage recommended by the RFC.

OR

You might be assuming a logged-in session, in which case the username
can be used as the info parameter.

    $new_key = hash_hkdf('sha256', $_SESSION['master_key'], 0, $_SESSION['username']);

The username is a non-secret user identifier, so this usage matches the
usage recommended by the RFC. However, the username is only available in
a logged-in session, and when the username is changed during the session
it stops working. It stops working at logout also. And most important of
all, this is not per-session, but per-user...

The RFC-compliant implementation in PHP is to use session_id() as a
secret salt.

    $new_key = hash_hkdf('sha256', $_SESSION['master_key'], 0, '', session_id());

This method is better because it works regardless of authentication or a
changed username. However, this version still has a problem with a
regenerated session ID. This could be fixed with a better salt.

    $_SESSION['salt'] = session_create_id();

then

    $new_key = hash_hkdf('sha256', $_SESSION['master_key'], 0, '', $_SESSION['salt']);

This works for any session, always. Both $new_key and $_SESSION['salt']
can safely be passed as page content, unlike session_id() as salt.
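The last variant above can be tried from the command line with a plain array standing in for $_SESSION. This is a sketch under assumptions: per_session_key() is a hypothetical helper, and the salt is generated with random_bytes() here instead of session_create_id() so that no session extension state is needed.

```php
<?php
// $session stands in for $_SESSION so the sketch runs without a web request.
$session = [
    'master_key' => random_bytes(80),           // kept for the whole session
    'salt'       => bin2hex(random_bytes(16)),  // assumption: stand-in for session_create_id()
];

// Hypothetical helper: derive the per-session key from the stored values.
function per_session_key(array $session): string
{
    // Current built-in order: hash, ikm, length (0 = hash length), info, salt.
    return hash_hkdf('sha256', $session['master_key'], 0, '', $session['salt']);
}

$key1 = per_session_key($session);
$key2 = per_session_key($session);   // same stored salt, so the same key

// A different stored salt gives an unrelated key from the same master key.
$other = $session;
$other['salt'] = bin2hex(random_bytes(16));
$key3 = per_session_key($other);
```

Because the salt is stored next to the master key instead of being taken from session_id(), the derived key does not depend on the session ID being regenerated.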
I have other examples like these that illustrate that users should
consider salt usage.

> c) Extracting strong cryptographic keying material from weak cryptographic
> keying material. Standard example here is extracting strong keys from DH
> g^xy values (which are non-uniform) and similar. This is the usage that
> benefits from a $salt.
>

Even when a strong IKM is used, a low-entropy salt such as a timestamp
can be used as part of a combined key. This use case would be one of the
most common in PHP.

> Remember that HKDF is an extract-and-expand algorithm, and the extract step
> (which uses the salt) is only necessary if the input keying material is
> weak. We always include the extract step for compatibility with the overall
> HKDF construction (per the RFCs recommendation), but it's essentially just
> an unnecessary operation if you work on strong keying material.
>

I presume my reply to a) is sufficient for this.

A strong (cryptographic) key, e.g. random_bytes(80), can be used in many
cases. Even when the key is supposed to be strong, it would be better to
use a salt to derive an even stronger key. This practice will prevent
users from accidentally omitting the salt for weak input keys. The salt
could be used as part of a combined key also.

> The only thing that we may want to discuss is whether we should swap the
> $info and the $salt parameters. This depends on which usage (b or c) we
> consider more likely.

"length" has little relevance with respect to IKM protection and output
key (OKM) security, while it is the user's responsibility to ensure a
secure PRK and protect the IKM through the Extract step

    PRK = HMAC-Hash(salt, IKM)

Per the RFC, the HKDF info parameter is unrelated to IKM protection and
output key security, but salt has that responsibility. Therefore, "salt"
must have priority over "info" and "length", IMHO.

Importance with regard to security:
    salt >>>> info > length

We also must consider how the output key and salt are used in real-world
PHP applications. There are applications that use the output key and
salt as a combined key, e.g. authentication, access keys with expiration.

    // URL access key with expiration
    $expire = time() + 90; // 90 sec timeout. Low entropy salt is allowed with strong IKM.
    $key = hash_hkdf('sha256', $_SESSION['strong_master_key'], 0, $URL, $expire);
    // Send $key and $expire as combined key for $URL

hash_hkdf() is a key generation function, and the output key and salt
are used as a "combined key" in many use cases. A parameter that acts as
a key is important. Therefore, the commonly used combined key part
(salt) is better located right after the IKM.

Importance with regard to common use case:
    salt >> info > length

Although there are HKDF usages without salt, many HKDF applications in
PHP require a salt or are better with one, e.g. the per-session
encryption above. Developers will write better applications if they
consider how salt could be used. Therefore, salt is better as a required
parameter, omitted only when a salt cannot be used.

Importance with regard to education:
    salt >>> info > length
    (Users must learn safe salt usage)

"length" is the least important for me. "salt" and "info" have an
important effect on derived key/input key security. These two should
have priority over "length".

Making the most responsible/sensitive parameter (salt), which is
currently optional, a required parameter should not be an issue for
users.

If this were C, I wouldn't care and would accept whatever signature. C
is full of pitfalls already. I just don't want to see news like
"Passwords are stolen from PHP app!".

hash_hkdf() could easily be misused like this:

    hash_hkdf('sha256', $weak_ikm, 9); // We can generate strong key easily, Nice!
    hash_hkdf('sha256', $weak_ikm, 9, 'Super User Only'); // Safe key for super user
    hash_hkdf('sha256', $strong_ikm, 4); // Secure and nice password for super secret

These are security disasters. If salt is required, users would at least
always think about it. Length should not be shortened unless the user is
absolutely sure.

If there are unclear sentences, please let me know.
Thank you for reading this long mail!

Regards,

P.S.
I'll be more careful, but I sometimes become a very sloppy mail reader.
I would appreciate it if you could let me know via private email.
Thank you!

--
Yasuo Ohgaki
yohgaki@ohgaki.net