Newsgroups: php.internals
Xref: news.php.net php.internals:115340
From: nikita.ppv@gmail.com (Nikita Popov)
Date: Wed, 7 Jul 2021 12:23:50 +0200
Subject: Re: [PHP-DEV] [RFC] Add Random Extension (before: Add Random class)
To: Go Kudo
Cc: PHP internals

On Tue, Jul 6, 2021 at 4:54 PM Go Kudo wrote:

> > Could you share some example of where you use it?
>
> It looks like mt_rand() could be replaced by mt_rand(0, getrandmax()), but
> that is not the case: mt_rand() with a specified range generates random
> numbers until the desired value is obtained, which may unintentionally
> advance the random state. This can be undesirable for pseudo-random numbers
> with periodicity.
>
> Also, shouldn't compatibility with mt_rand() be maintained? Currently,
> NumberGenerator\MT19937 is fully compatible with mt_rand(); do we need to
> drop it?
>
> > $random->getNumberGenerator()->generate() to access the raw RNG stream.
>
> Indeed.
> I think nextInt() can be removed from Random, since it currently
> returns exactly the same value. (However, I think
> NumberGenerator::generate() should be kept).
>

Yes, I think that would be best. As the value returned by nextInt() is
already available through generate(), and use cases for nextInt() probably
aren't common, I don't think we need the separate nextInt() method. We can
always add it later if it turns out to be a common requirement (but not the
other way around...)

Regards,
Nikita

> On Tue, Jul 6, 2021 at 23:35, Nikita Popov wrote:
>
>> On Fri, Jul 2, 2021 at 3:58 PM Go Kudo wrote:
>>
>>> > * The first bit is just clarification. After a cursory look at the
>>> > implementation, my understanding is that the getInt(), shuffleArray()
>>> > and shuffleString() APIs will always produce consistent results on
>>> > 32-bit and 64-bit, as long as your inputs are 32-bit as well (i.e.,
>>> > min/max are 32-bit and string is smaller than 4G). Is that correct?
>>> > The only APIs that would exhibit different behavior are nextInt() and
>>> > getBytes(), right?
>>>
>>> Yes. I do not want to break the compatibility of the implementation. I
>>> would prefer to be able to migrate code that uses the current internal
>>> state.
>>>
>>> > * Looking at the implementation, nextInt() performs a >> 1 operation
>>> > on the RNG result. I assume the motivation is to get back a
>>> > non-negative number. But why do we want that? The "nextInt()" name
>>> > doesn't really indicate that it's a positive number. I think more
>>> > generally, my question here may be "Why does this method exist at all?
>>> > When would you use it instead of getInt()?"
>>>
>>> This was to allow for future forward compatibility. When PHP_INT_SIZE
>>> exceeds 8, the result will be incompatible without bit shifting. This is
>>> similar to the way mt_rand() does bit shifting now.
>>>
>>> However, I can agree that such a day will never come in reality.
>>> And as the comments on GitHub show, there are ways to keep the values
>>> compatible even if such a time comes.
>>>
>>> After thinking about it for a while, I finally came to the conclusion
>>> that there is no benefit to this other than to make mt_rand() and
>>> Random\NumberGenerator\MT19937 directly compatible.
>>> If compatibility is needed, it can be achieved by bit shifting in the
>>> PHP code, so direct compatibility is probably unnecessary. I will change
>>> the implementation and remove this option.
>>>
>>> > "Why does this method exist at all? When would you use it instead of
>>> > getInt()?"
>>>
>>> The case for this would be if you want to get a raw unrounded random
>>> number sequence as a number. The situations where this is required would
>>> certainly be limited, but it would be nice to have. (At least, I know of
>>> several production codebases that use the result of mt_rand() with no
>>> arguments.)
>>>
>>
>> Could you share some example of where you use it? Maybe that will help
>> understand the motivation for it.
>>
>> Also, I think it's worth pointing out that it's always possible to use
>> $random->getNumberGenerator()->generate() to access the raw RNG stream.
>>
>> Regards,
>> Nikita
>>
>>> > * I don't really get why we need RandomInterface. I think if the
>>> > choice is between "final + interface" and "non-final without
>>> > interface", I'd prefer the latter (though I'm also happy with "final
>>> > without interface").
>>>
>>> I had completely lost my train of thought on this. The interface makes
>>> the Random class unextensible. I have removed this.
>>>
>>> > I'm not entirely happy with the naming. Unfortunately, I don't have
>>> > great suggestions either. I think in your current hierarchy, I would
>>> > make the interface Random\NumberGenerator (with implementation in the
>>> > sub-namespace), rather than
>>> > Random\NumberGenerator\RandomNumberGenerator.
>>>
>>> Deep-rooted problem.
>>> For now, I'm going to change RandomNumberGenerator
>>> to Random\NumberGenerator. It's the best one so far.
>>>
>>>
>>> I continue to be plagued by Valgrind warnings and crashes of Windows ZTS
>>> builds...
>>> I'd like to move to the voting phase once these are fixed...
>>>
>>> Regards,
>>> Go Kudo
>>>
>>> On Tue, Jun 29, 2021 at 23:01, Nikita Popov wrote:
>>>
>>>> On Sat, Jun 26, 2021 at 2:40 AM Go Kudo wrote:
>>>>
>>>>> Hello Internals.
>>>>>
>>>>> RFC has been reorganized for finalization.
>>>>>
>>>>> https://wiki.php.net/rfc/rng_extension
>>>>>
>>>>> The changes from the previous version are as follows:
>>>>>
>>>>> - Changed again to a class-based approach. The argument can be omitted,
>>>>> in which case an instance of XorShift128Plus will be created
>>>>> automatically.
>>>>> - Future scope was specified in the RFC and the functionality was
>>>>> separated as a Random extension.
>>>>> - Changed to separate it as a Random extension and use the appropriate
>>>>> namespace.
>>>>> - In order to extend the versatility of the final class, Random, a
>>>>> RandomInterface has been added, similar in approach to the
>>>>> DateTimeInterface.
>>>>>
>>>>
>>>> The updated proposal looks quite nice :) I think this is close to done.
>>>> Some small bits of feedback:
>>>>
>>>> * The first bit is just clarification. After a cursory look at the
>>>> implementation, my understanding is that the getInt(), shuffleArray() and
>>>> shuffleString() APIs will always produce consistent results on 32-bit and
>>>> 64-bit, as long as your inputs are 32-bit as well (i.e., min/max are
>>>> 32-bit and string is smaller than 4G). Is that correct? The only APIs
>>>> that would exhibit different behavior are nextInt() and getBytes(), right?
>>>> * Looking at the implementation, nextInt() performs a >> 1 operation
>>>> on the RNG result. I assume the motivation is to get back a non-negative
>>>> number. But why do we want that?
>>>> The "nextInt()" name doesn't really
>>>> indicate that it's a positive number. I think more generally, my question
>>>> here may be "Why does this method exist at all? When would you use it
>>>> instead of getInt()?"
>>>> * Another bit of clarification: For the user-defined RNG, which range
>>>> is generate() expected to return? I assume that it must return the native
>>>> integer size, i.e. 32-bit on 32-bit and 64-bit on 64-bit?
>>>> * I don't really get why we need RandomInterface. I think if the
>>>> choice is between "final + interface" and "non-final without interface",
>>>> I'd prefer the latter (though I'm also happy with "final without
>>>> interface").
>>>> * I'm not entirely happy with the naming. Unfortunately, I don't have
>>>> great suggestions either. I think in your current hierarchy, I would make
>>>> the interface Random\NumberGenerator (with implementation in the
>>>> sub-namespace), rather than Random\NumberGenerator\RandomNumberGenerator.
>>>>
>>>> Regards,
>>>> Nikita
>>>>
>>>>> I've tidied up the implementation to make this final, but Valgrind is
>>>>> currently reporting an error for unknown reasons.
>>>>>
>>>>> Implementation is here: https://github.com/php/php-src/pull/7079
>>>>>
>>>>> This can be reproduced with the following code.
>>>>>
>>>>> ```sh
>>>>> # Success
>>>>> $ valgrind ./sapi/cli/php -r '$random = new Random(); $random->nextInt();'
>>>>> ==95522== Memcheck, a memory error detector
>>>>> ==95522== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
>>>>> ==95522== Using Valgrind-3.14.0 and LibVEX; rerun with -h for copyright info
>>>>> ==95522== Command: ./sapi/cli/php -r $random\ =\ new\ Random();\ $random-\>nextInt();
>>>>> ==95522==
>>>>> ==95522==
>>>>> ==95522== HEAP SUMMARY:
>>>>> ==95522==     in use at exit: 1,286 bytes in 32 blocks
>>>>> ==95522==   total heap usage: 28,445 allocs, 28,413 frees, 4,333,047 bytes allocated
>>>>> ==95522==
>>>>> ==95522== LEAK SUMMARY:
>>>>> ==95522==    definitely lost: 0 bytes in 0 blocks
>>>>> ==95522==    indirectly lost: 0 bytes in 0 blocks
>>>>> ==95522==      possibly lost: 0 bytes in 0 blocks
>>>>> ==95522==    still reachable: 1,286 bytes in 32 blocks
>>>>> ==95522==         suppressed: 0 bytes in 0 blocks
>>>>> ==95522== Rerun with --leak-check=full to see details of leaked memory
>>>>> ==95522==
>>>>> ==95522== For counts of detected and suppressed errors, rerun with: -v
>>>>> ==95522== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
>>>>>
>>>>> # Fail
>>>>> $ valgrind ./sapi/cli/php -r '$random = new Random(); $random->nextInt() === $random->nextInt();'
>>>>> ==95395== Memcheck, a memory error detector
>>>>> ==95395== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
>>>>> ==95395== Using Valgrind-3.14.0 and LibVEX; rerun with -h for copyright info
>>>>> ==95395== Command: ./sapi/cli/php -r $random\ =\ new\ Random();\ $random-\>nextInt()\ ===\ $random-\>nextInt();
>>>>> ==95395==
>>>>> ==95395== Conditional jump or move depends on uninitialised value(s)
>>>>> ==95395==    at 0x966925: ZEND_IS_IDENTICAL_SPEC_VAR_VAR_HANDLER (zend_vm_execute.h:27024)
>>>>> ==95395==    by 0x99AC27: execute_ex (zend_vm_execute.h:57236)
>>>>> ==95395==    by 0x99C902: zend_execute (zend_vm_execute.h:59026)
>>>>> ==95395==    by 0x8DB6B4: zend_eval_stringl (zend_execute_API.c:1191)
>>>>> ==95395==    by 0x8DB861: zend_eval_stringl_ex (zend_execute_API.c:1233)
>>>>> ==95395==    by 0x8DB8D6: zend_eval_string_ex (zend_execute_API.c:1243)
>>>>> ==95395==    by 0xA4DAE4: do_cli (php_cli.c:995)
>>>>> ==95395==    by 0xA4E8E2: main (php_cli.c:1366)
>>>>> ==95395==
>>>>> ==95395==
>>>>> ==95395== HEAP SUMMARY:
>>>>> ==95395==     in use at exit: 1,286 bytes in 32 blocks
>>>>> ==95395==   total heap usage: 28,445 allocs, 28,413 frees, 4,333,070 bytes allocated
>>>>> ==95395==
>>>>> ==95395== LEAK SUMMARY:
>>>>> ==95395==    definitely lost: 0 bytes in 0 blocks
>>>>> ==95395==    indirectly lost: 0 bytes in 0 blocks
>>>>> ==95395==      possibly lost: 0 bytes in 0 blocks
>>>>> ==95395==    still reachable: 1,286 bytes in 32 blocks
>>>>> ==95395==         suppressed: 0 bytes in 0 blocks
>>>>> ==95395== Rerun with --leak-check=full to see details of leaked memory
>>>>> ==95395==
>>>>> ==95395== For counts of detected and suppressed errors, rerun with: -v
>>>>> ==95395== Use --track-origins=yes to see where uninitialised values come from
>>>>> ==95395== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
>>>>> ```
>>>>>
>>>>> However, the error is detected inside the Zend VM, and I have not
>>>>> identified the cause. From the code, it looks like memory management is
>>>>> being done properly.
>>>>>
>>>>> I allocate memory in a somewhat tricky way so that the processing can
>>>>> be shared. Do I need to give some hints to Valgrind?
>>>>>
>>>>> If you know, I would appreciate your advice on this issue.
>>>>>
>>>>> Regards,
>>>>> Go Kudo
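
For reference, the userland bit shift mentioned in the thread (getting an
mt_rand()-style non-negative value out of a raw generator result) might look
roughly like the following. This is only a minimal sketch, assuming a 64-bit
PHP build; the helper names shiftRaw32() and shiftRaw64() are hypothetical and
are not part of the RFC or of the implementation in the pull request.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of the RFC): drop the lowest bit of a raw
 * 32-bit generator result, leaving a non-negative 31-bit value.
 * Assumes a 64-bit PHP build, where 0..0xFFFFFFFF fits in a native int.
 */
function shiftRaw32(int $raw): int
{
    // Mask to 32 bits first so stray high bits cannot leak into the result.
    return ($raw & 0xFFFFFFFF) >> 1;
}

/**
 * Hypothetical helper (not part of the RFC): the same idea for a raw 64-bit
 * result stored in PHP's signed int. PHP's >> is an arithmetic shift, so the
 * sign bit is cleared afterwards to emulate an unsigned shift.
 */
function shiftRaw64(int $raw): int
{
    return ($raw >> 1) & PHP_INT_MAX;
}

// Usage: the largest 32-bit value maps to the largest 31-bit value.
var_dump(shiftRaw32(0xFFFFFFFF) === 0x7FFFFFFF); // bool(true)
// All 64 bits set (-1 as a signed int) maps to PHP_INT_MAX.
var_dump(shiftRaw64(-1) === PHP_INT_MAX);        // bool(true)
```

The masking in shiftRaw64() matters because a plain `$raw >> 1` on a negative
PHP integer stays negative, which would defeat the purpose of the shift.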