Newsgroups: php.internals
Xref: news.php.net php.internals:112786
Date: Thu, 7 Jan 2021 01:56:49 +0900
From: zeriyoshi@gmail.com (Go Kudo)
To: Nikita Popov, internals@lists.php.net
Subject: Re: [PHP-DEV] Re: Improving PRNG implementation.

Hello Nikita.

> In addition to the functions you already list, I'd add
> function rng_bytes(RNGInterface $rng, int $length): string;
> function rng_between(RNGInterface $rng, int $min, int $max): int;

Certainly, these are useful. I've added them to the RFC as part of TypeC.

> Just to share another possibility, we could also modify existing functions to optionally accept RNGInterface, like
> shuffle(array &$array, ?RNGInterface $rng = null): bool;

This approach may be appropriate. This way, the user is implicitly aware that they are using the global-state RNG.

> What do you mean by "use unintended instances" here?

For example, if you write the following code, the user will receive an unintended result.

```php
$rng1 = new \RNG\XorShift128Plus(1234);
$rng2 = new \RNG\MT19937(1234);

$arr1 = range(1, 100);
$arr2 = range(1, 100);

rng_shuffle($rng1, $arr1);
rng_shuffle($rng1, $arr2); // oops, this call uses $rng1 where $rng2 was intended.
```

I believe such mistakes are less likely if these features are provided as class methods. However, that only reduces the likelihood; similar mistakes can still happen.

> Regarding implementation complexity, yes, this version is a bit more involved if we want to allow userland implementations of RNGInterface. To be honest I don't particularly care about supporting userland (we could also only allow it to be implemented internally), but I think it shouldn't be too hard either.

Yes, it's probably possible. I need to understand the PHP implementation more deeply :)
If you know of any implementations that might be helpful, I would like to know about them.

As for userland implementations, I don't put much emphasis on them either. I only use classes for clarity, and I'm fine with implementations being provided only by the core or by extensions.

> This is generally the part I'm least sure about. I agree that it would be good to leverage 64-bit numbers on systems that support them. On the other hand, I also think that having a consistent random number stream between 32-bit and 64-bit systems may be important. For example, if you use this functionality for predictable "randomness" in unit tests, then it would be a problem if your library could only be tested on 32-bit or only on 64-bit.

Certainly, this seems to be a problem. However, 32-bit environments are now very limited in number, and a 64-bit RNG can be useful for applications and libraries that are designed to run on 64-bit architectures.

Also, some of the new RNGs that we are trying to implement, such as XorShift128+, keep their internal state in uint64_t. The random numbers they generate are always 64-bit, and truncating them to 32 bits may prevent their intended use.

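To make the trade-off concrete, here is a minimal sketch of the portable alternative: assembling a 64-bit value from two 32-bit outputs. It assumes 64-bit PHP, and `Xorshift32Sketch` with its `next()` and `next64()` methods is a hypothetical userland class, not the RFC's interface or the code in the PR. A plain 32-bit xorshift stands in for the generator here; it is not XorShift128+.

```php
<?php
declare(strict_types=1);

// Hypothetical userland generator, for illustration only.
// Assumes 64-bit PHP (PHP_INT_SIZE === 8) so that unsigned 32-bit values fit in an int.
final class Xorshift32Sketch
{
    private const MASK32 = 0xFFFFFFFF;

    private int $state;

    public function __construct(int $seed)
    {
        // A xorshift state of zero would produce zero forever, so reject it.
        $state = $seed & self::MASK32;
        if ($state === 0) {
            throw new InvalidArgumentException('Seed must be non-zero in its low 32 bits.');
        }
        $this->state = $state;
    }

    // Returns an unsigned 32-bit value (xorshift32 with shifts 13, 17, 5).
    public function next(): int
    {
        $x = $this->state;
        $x ^= ($x << 13) & self::MASK32;
        $x ^= $x >> 17;
        $x ^= ($x << 5) & self::MASK32;

        return $this->state = $x;
    }

    // Assembles a 64-bit pattern from two consecutive 32-bit outputs.
    // The result is a signed PHP int, so it is negative when the top bit is set.
    public function next64(): int
    {
        $high = $this->next();
        $low = $this->next();

        return ($high << 32) | $low;
    }
}

$rng = new Xorshift32Sketch(1);
printf("%d\n", $rng->next64());
```

Composing values this way costs two generator steps per 64-bit result, and it cannot reproduce the native output sequence of a 64-bit generator such as XorShift128+.
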
For these reasons, I think that we should provide a `next64()` method.
However, care must be taken when serializing the instance. We are trying to implement serialization that is safe on 32-bit environments in the current PR, but the approach is not elegant, to say the least.

https://github.com/php/php-src/pull/6568/files#diff-0f1c13e606f11fb5da8dc091109852a6dec328e5aea2389091ff839510cf23faR275

Based on the above discussion, I have added TypeC2 to the RFC. How about this?

https://wiki.php.net/rfc/object_scope_prng

Regards,
Go Kudo

On Wed, Jan 6, 2021 at 19:50, Nikita Popov wrote:

> On Tue, Jan 5, 2021 at 6:40 PM Go Kudo wrote:
>
>> Thanks Nikita.
>>
>> Thanks for the input. This looks like a very smart solution.
>> I've edited the RFC and added a new proposal as TypeC.
>>
>> https://wiki.php.net/rfc/object_scope_prng
>>
>
> In addition to the functions you already list, I'd add
>
> function rng_bytes(RNGInterface $rng, int $length): string;
> function rng_between(RNGInterface $rng, int $min, int $max): int;
>
> Similar to your type B proposal.
>
> Just to share another possibility, we could also modify existing functions
> to optionally accept RNGInterface, like
>
> shuffle(array &$array, ?RNGInterface $rng = null): bool;
>
> However, I'm not sure this is better than adding a separate family of
> functions.
>
>
>> However, I have a couple of concerns. The first is that users will have
>> to be careful with RNG instances (for better or worse, TypeA will not allow
>> users to use unintended instances), and the second is that it will
>> complicate the implementation of PHP functions.
>>
>
> What do you mean by "use unintended instances" here?
>
> Regarding implementation complexity, yes, this version is a bit more
> involved if we want to allow userland implementations of RNGInterface. To
> be honest I don't particularly care about supporting userland (we could
> also only allow it to be implemented internally), but I think it shouldn't
> be too hard either.
>
>
>> Also, given the current environment in which PHP runs, I think that
>> random number generation in the 64bit range should be supported.
>> How about including the `next64()` method in the interface?
>>
>
> This is generally the part I'm least sure about. I agree that it would be
> good to leverage 64-bit numbers on systems that support them. On the other
> hand, I also think that having a consistent random number stream between
> 32-bit and 64-bit systems may be important. For example, if you use this
> functionality for predictable "randomness" in unit tests, then it would be
> a problem if your library could only be tested on 32-bit or only on 64-bit.
>
> As long as an API like rng_between($rng, $min, $max) is used, we can
> always assemble two 32-bit values into a random number that is larger than
> 32-bit. But of course, fetching a single 64-bit number is more efficient...
>
> Nikita
>
>
>> On Tue, Jan 5, 2021 at 23:53, Nikita Popov wrote:
>>
>>> On Thu, Dec 24, 2020 at 5:12 PM zeriyoshi wrote:
>>>
>>>> I updated the RFC draft and changed it to a proposal to bifurcate the
>>>> interface.
>>>>
>>>> https://wiki.php.net/rfc/object_scope_prng
>>>>
>>>> At the same time, I was looking into the RNG problem in Swoole and found
>>>> out that the problem was actually occurring.
>>>>
>>>> https://www.easyswoole.com/En/Other/random.html
>>>>
>>>> The above problem with Swoole is only for child processes and can be
>>>> fixed by the user, but as mentioned above, I think it will become more
>>>> serious as PHP becomes more complex in the future.
>>>> I hadn't thought of this before, but we might want to consider
>>>> deprecating existing state-dependent RNG functions.
>>>>
>>>> I am seeking feedback on this proposal in order to take the RFC to the
>>>> next step.
>>>> Thank you in advance.
>>>>
>>>> Regards,
>>>> Go Kudo
>>>>
>>>
>>> I think the general direction of your second proposal (B) is the right
>>> one, but I'm concerned about some of the details.
>>>
>>> First and foremost, I don't think that RNG should extend Iterator. Just
>>> having a single "next()" method or similar is sufficient. The Iterator
>>> interface provides many additional things that don't really make sense in
>>> this context and only complicate the implementation. We should clearly
>>> specify the value range of next() though. I assume that for portability
>>> reasons (ensure same sequence on 32-bit and 64-bit) this would have to be a
>>> sign-extended 32-bit value.
>>>
>>> The type B proposal distinguishes between a RNG and a PRNG interface. Is
>>> this really useful? I don't think the value of the "getSeed()" method is
>>> high enough to justify the split interface.
>>>
>>> The RNGUtil class has methods like:
>>>
>>> public static function shuffleArray(int $randomNumber, array $arr): array;
>>>
>>> This doesn't make sense to me, as a single random number is not enough
>>> to shuffle a whole array :) Instead this should be accepting the RNG and
>>> perform an internal call to next() (of course, with a fast by-pass
>>> mechanism if the RNG is implemented internally):
>>>
>>> public static function shuffleArray(RNG $rng, array $arr): array;
>>>
>>> I'm also wondering why this is a static class rather than a set of
>>> free-standing functions. Why RNGUtil::shuffleArray() rather than
>>> rng_shuffle_array()?
>>>
>>> Regards,
>>> Nikita
>>>
>>> On Wed, Dec 16, 2020 at 23:46, zeriyoshi wrote:
>>>>
>>>> > Nice to meet you, internals.
>>>> >
>>>> > PHP 8.0 has been released. With the inclusion of JIT, PHP is about to be
>>>> > extended beyond the web.
>>>> >
>>>> > So I'd like to make a few suggestions.
>>>> >
>>>> > First, PHP has the historical Mersenne Twister PRNG.
>>>> > However, this implementation keeps its state in a global and cannot be
>>>> > handled as an object like other languages (e.g. Java).
>>>> >
>>>> > So, I created a PHP Extension and proposed it to PECL.
>>>> >
>>>> > https://marc.info/?l=pecl-dev&m=160795415604102&w=2
>>>> > https://github.com/zeriyoshi/php-ext-orng
>>>> >
>>>> > But then I looked at the mailing list archives and noticed that a similar
>>>> > proposal had been made before.
>>>> >
>>>> > https://externals.io/message/98021#98130
>>>> >
>>>> > I feel that this suggestion is needed now to expand PHP beyond the web.
>>>> >
>>>> > The second suggestion is to stop using the Combined LCG as the default
>>>> > seed value for each function.
>>>> >
>>>> > PHP's Combined LCG only uses PID (or ZTS Thread ID) and time as entropy.
>>>> > https://github.com/php/php-src/blob/master/ext/standard/lcg.c#L72
>>>> >
>>>> > With the development of container technology, this problem seems to be
>>>> > getting more serious. So I think we should use the random numbers provided
>>>> > by the OS (getrandom on Linux) if available.
>>>> >
>>>> > I would like to hear your opinions.
>>>> >
>>>> > Regards
>>>> > Go Kudo
>>>> >
>>>>
>>>
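
As a rough illustration of the second suggestion in the original message quoted above: `random_int()` already draws from the operating system's CSPRNG (for example getrandom(2) on Linux), so a seed chosen in userland need not depend on the PID or the clock. This is only a minimal sketch; `seed_from_os()` is a hypothetical helper name, and nothing here changes how PHP seeds itself internally.

```php
<?php
declare(strict_types=1);

// Hypothetical helper: returns a non-negative 31-bit seed taken from the OS CSPRNG.
function seed_from_os(): int
{
    return random_int(0, 0x7FFFFFFF);
}

$seed = seed_from_os();

// Seeding the global Mersenne Twister explicitly keeps the sequence
// reproducible for anyone who records $seed.
mt_srand($seed);
$first = mt_rand();

mt_srand($seed);
$again = mt_rand();

var_dump($first === $again); // same seed, same first output
```

Recording the seed preserves reproducibility for tests while keeping the initial entropy independent of process-level values.
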