Newsgroups: php.internals
Xref: news.php.net php.internals:117923
MIME-Version: 1.0
References: <77a67ec1-2b32-0e86-3714-9b2600691c18@bastelstu.be> <9c907fc8-ae1c-7c2f-c77a-727d03e70407@bastelstu.be>
In-Reply-To: <9c907fc8-ae1c-7c2f-c77a-727d03e70407@bastelstu.be>
Date: Mon, 13 Jun 2022 11:23:56 +0900
To: Tim Düsterhus, internals@lists.php.net
Content-Type: multipart/alternative; boundary="0000000000007e214e05e14afb9f"
Subject: Re: [PHP-DEV] [RFC] [Vote] Pre-vote announcement for Random Extension 5.x
From: g-kudo@colopl.co.jp (Go Kudo)

--0000000000007e214e05e14afb9f
Content-Type: text/plain; charset="UTF-8"

On Fri, 10 Jun 2022 at 19:42, Tim Düsterhus wrote:

> Hi
>
> On 6/10/22 12:02, Go Kudo wrote:
> >> It has a single generate(): string method that generates random numbers
> > as a binary string. This string must be non-empty and attempting to return
> > an empty will result in a RuntimeException.
> >> If you implement a random number generator in PHP, the generated numbers
> > must be converted to binary using the pack() function, and the values must
> > be little-endian.
>
> Thanks, that looks good to me.
>
> > * class Random Randomizer
> >
> >> The same current PHP algorithm is used to generate random numbers within
> > the specified range in Randomizer::getInt(). This is necessary for backward
> > compatibility.
> >> It also provides a guarantee of consistency in the future.
> >
> > However, I understand that perhaps this fix will not satisfy your request.
> > I just do not have a good understanding of your intentions due to my poor
> > English....
>
> Don't worry. I understand that using a foreign language can sometimes be
> hard - I am not a native speaker of English either and I suspect that
> this also applies to many other participants.
>
> > I am considering the following possibilities regarding your intentions:
> >
> > 1. Should be stated in detail so that consistency of results is maintained
> > in the future.
> > 2. Current PHP ranged random number generation algorithm has room for
> > improvement and should be examined further
> > 3. Consistency of results is difficult to maintain in the future (Maybe
> > incorrect)
> >
> > In case 1, I think the following statement would satisfy the requirement.
> >
> >> The values generated by the seedable Engine guarantee future
> > reproducibility of results, and the Randomizer uses the results to process
> > them, so if the results generated by the Engine are identical, the
> > Randomizer's results will also be consistent.
> >
> > Although the consistency of the Randomizer results is not mentioned here,
> > it would be a clear BC Break if the results were to change after the
> > Randomizer is officially implemented, so I do think it is sufficient that
> > an RFC be created and voted on as necessary.
>
> If I understand you correctly, then (1) is what I am looking for: It
> should be spelled out explicitly what behavior the user may rely on and
> what should be considered an implementation detail.
>
> Something like the following would work for me:
>
> ----
>
> The engines implement a specific well-defined random number generator.
> For a given seed it is guaranteed that they return the same sequence as
> the reference implementation.
>
> For the Randomizer it is considered a breaking change if the observable
> behavior of the methods changes. For a given seeded engine and identical
> method parameters the following must hold:
>
> - The number of calls to the Engine's ->generate() method remains the same.
> - The return value remains the same for a given result retrieved from
>   ->generate().
>
> Any changes to the Randomizer that violate these guarantees require a
> separate RFC.
>
> ----
>
> > In case 2, I also thought about it along the way. Nikita also taught me
> > about better algorithms: https://externals.io/message/115918#115982
> >
> > But, I also think that the current PHP implementation is good enough, and
> > there is no need to change it to the point of breaking compatibility.
> >
> > I think the current global scope MT implementation is very troublesome for
> > some use cases and should first be able to be drop-in-replaceable with this
> > implementation.
> >
> > In case 3, I think it is necessary to guarantee consistency at least once
> > at the language level, even though this may change in the future. I have
> > already indicated the need for this in the RFC.
>
> Can you comment on whether the Randomizer behaves identically on both 32
> and 64 bit PHP and also on little endian and big endian architectures?
> As an example: Will ->getInt() calculate the same "uniform distribution"
> on all bitnesses? If not I consider that a bug.
>
> The engines *should* behave identically, because of the "little endian
> string" return value.
>
> If that's already the case then something like the following should be
> added to the RFC guarantees:
>
> - The implementation will guarantee that the same results are returned
>   independent of the processor architecture (little endian / big endian)
>   and integer bit length's (32 / 64).
>
> > Most of all, I do not believe you intend this to be the case. (And this
> > sentence is not intended to offend you either. Please don't misunderstand
> > me.)
> >
> >
> Best regards
> Tim Düsterhus
>

It does not depend on endianness, but it does depend on the number of bits.
This is because the new algorithm generates 64-bit values.

When using a 64-bit RNG in a 32-bit environment, Engine::generate() returns
the same binary string as in a 64-bit environment, but the Randomizer methods
return different values. This is because zend_long in a 32-bit environment is
narrower than uint64_t, so the value is truncated.

To keep the results the same in 32-bit and 64-bit environments, only the
lower 32 bits of the 64-bit value should be used. However, this reduces
randomness and does not seem appropriate, considering that most environments
running PHP are 64-bit.

I have created a PoC that performs all internal operations in 64 bits so that
the results are the same in both environments, although execution in 32-bit
environments becomes less efficient. (Note that Randomizer::getInt() with no
argument is still incompatible.)

https://github.com/php/php-src/commit/dbed218bfcd45e713fa3df2c88a4c2efce9f0651

Another idea I had was to throw an exception when trying to create a 64-bit
RNG in a 32-bit environment.

Regards,
Go Kudo

--0000000000007e214e05e14afb9f--
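
To illustrate the truncation described in the reply above, here is a minimal
C sketch. It is not taken from php-src: decode_le64() and low_32_bits() are
hypothetical helper names, and uint32_t only stands in for the value range of
a 32-bit zend_long. It shows that the 8-byte little-endian string from an
engine decodes to the same 64-bit value on any host, while keeping only 32
bits of it yields a different number.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Decode 8 little-endian bytes into a uint64_t.
 * Uses shifts only, so the result does not depend on host byte order.
 * (Hypothetical helper, not a php-src function.)
 */
static uint64_t decode_le64(const unsigned char *bytes)
{
    assert(bytes != NULL);

    uint64_t value = 0;
    /* Start with the most significant byte, which is stored last. */
    for (int i = 7; i >= 0; i--) {
        value = (value << 8) | bytes[i];
    }
    return value;
}

/*
 * Model of what is left when a 64-bit value is stored in a 32-bit
 * integer: only the lower 32 bits survive. (Assumption: the truncation
 * keeps the low bits; the mail only says the value "is truncated".)
 */
static uint32_t low_32_bits(uint64_t value)
{
    return (uint32_t)(value & UINT64_C(0xFFFFFFFF));
}
```

For the bytes 01 02 03 04 05 06 07 08, decode_le64() gives
0x0807060504030201 on every architecture, while low_32_bits() of that value
is 0x04030201. Under the stated assumptions, this is the kind of mismatch
between 64-bit and 32-bit builds that the mail refers to.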