Newsgroups: php.internals
Xref: news.php.net php.internals:115329
Date: Tue, 6 Jul 2021 23:54:09 +0900
From: zeriyoshi@gmail.com (Go Kudo)
To: Nikita Popov, PHP internals
Subject: Re: [PHP-DEV] [RFC] Add Random Extension (before: Add Random class)

> Could you share some example of where you use it?

It looks as though mt_rand() could be replaced by mt_rand(0, getrandmax()), but that is not the case. When a range is specified, mt_rand() keeps generating random numbers until a value in the desired range is obtained, which may unintentionally advance the random state. This can be undesirable for pseudo-random numbers with periodicity.

Also, shouldn't compatibility with mt_rand() be maintained? Currently, NumberGenerator\MT19937 is fully compatible with mt_rand(). Do we need to drop that?

> $random->getNumberGenerator()->generate() to access the raw RNG stream.

Indeed. I think nextInt() can be removed from Random, since it currently returns exactly the same value.
(However, I think NumberGenerator::generate() should be kept.)

Regards,
Go Kudo

On Tue, Jul 6, 2021 at 23:35 Nikita Popov wrote:

> On Fri, Jul 2, 2021 at 3:58 PM Go Kudo wrote:
>
>> > * The first bit is just clarification. After a cursory look at the implementation, my understanding is that the getInt(), shuffleArray() and shuffleString() APIs will always produce consistent results on 32-bit and 64-bit, as long as your inputs are 32-bit as well (i.e., min/max are 32-bit and string is smaller than 4G). Is that correct? The only APIs that would exhibit different behavior are nextInt() and getBytes(), right?
>>
>> Yes. I do not want to break the compatibility of the implementation. I would prefer to be able to migrate code that uses the current internal state.
>>
>> > * Looking at the implementation, nextInt() performs a >> 1 operation on the RNG result. I assume the motivation is to get back a non-negative number. But why do we want that? The "nextInt()" name doesn't really indicate that it's a positive number. I think more generally, my question here may be "Why does this method exist at all? When would you use it instead of getInt()?"
>>
>> This was to allow for future forward compatibility. When PHP_INT_SIZE exceeds 8, the result will be incompatible without bit shifting. This is similar to the way mt_rand() does bit shifting now.
>>
>> However, I can agree that such a day will never come in reality. And as the comments on GitHub show, there are ways to keep the values compatible even if such a time comes.
>>
>> After thinking about it for a while, I finally came to the conclusion that there is no benefit to this other than to make mt_rand() and Random\NumberGenerator\MT19937 directly compatible.
>> If compatibility is needed, it can be achieved by bit shifting in the PHP code, so direct compatibility is probably unnecessary.
>> I will change the implementation and remove this option.
>>
>> > "Why does this method exist at all? When would you use it instead of getInt()?"
>>
>> The case for this would be if you want to get a raw unrounded random number sequence as a number. The situations where this is required would certainly be limited, but it would be nice to have. (At least, I know of several production codes that use the result of mt_rand() with no arguments.)
>>
>
> Could you share some example of where you use it? Maybe that will help understand the motivation for it.
>
> Also, I think it's worth pointing out that it's always possible to use $random->getNumberGenerator()->generate() to access the raw RNG stream.
>
> Regards,
> Nikita
>
>> > * I don't really get why we need RandomInterface. I think if the choice is between "final + interface" and "non-final without interface", I'd prefer the latter (though I'm also happy with "final without interface").
>>
>> I had completely lost my train of thought on this. The interface makes the Random class unextensible. I have removed this.
>>
>> > I'm not entirely happy with the naming. Unfortunately, I don't have great suggestions either. I think in your current hierarchy, I would make the interface Random\NumberGenerator (with implementation in the sub-namespace), rather than Random\NumberGenerator\RandomNumberGenerator.
>>
>> Deep-rooted problem. For now, I'm going to change RandomNumberGenerator to Random\NumberGenerator. It's the best one so far.
>>
>> I continue to be plagued by Valgrind warnings and crashes of Windows ZTS builds...
>> I'd like to make a voting phase that is fixed ...
>>
>> Regards,
>> Go Kudo
>>
>> On Tue, Jun 29, 2021 at 23:01 Nikita Popov wrote:
>>
>>> On Sat, Jun 26, 2021 at 2:40 AM Go Kudo wrote:
>>>
>>>> Hello Internals.
>>>>
>>>> RFC has been reorganized for finalization.
>>>>
>>>> https://wiki.php.net/rfc/rng_extension
>>>>
>>>> The changes from the previous version are as follows:
>>>>
>>>> - Changed again to a class-based approach. The argument can be omitted, in which case an instance of XorShift128Plus will be created automatically.
>>>> - Future scope was specified in the RFC and the functionality was separated as a Random extension.
>>>> - Changed to separate it as a Random extension and use the appropriate namespace.
>>>> - In order to extend the versatility of the final class, Random, a RandomInterface has been added, similar in approach to the DateTimeInterface.
>>>>
>>>
>>> The updated proposal looks quite nice :) I think this is close to done. Some small bits of feedback:
>>>
>>> * The first bit is just clarification. After a cursory look at the implementation, my understanding is that the getInt(), shuffleArray() and shuffleString() APIs will always produce consistent results on 32-bit and 64-bit, as long as your inputs are 32-bit as well (i.e., min/max are 32-bit and string is smaller than 4G). Is that correct? The only APIs that would exhibit different behavior are nextInt() and getBytes(), right?
>>> * Looking at the implementation, nextInt() performs a >> 1 operation on the RNG result. I assume the motivation is to get back a non-negative number. But why do we want that? The "nextInt()" name doesn't really indicate that it's a positive number. I think more generally, my question here may be "Why does this method exist at all? When would you use it instead of getInt()?"
>>> * Another bit of clarification: For the user-defined RNG, which range is generate() expected to return? I assume that it must return the native integer size, i.e. 32-bit on 32-bit and 64-bit on 64-bit?
>>> * I don't really get why we need RandomInterface.
>>> I think if the choice is between "final + interface" and "non-final without interface", I'd prefer the latter (though I'm also happy with "final without interface").
>>> * I'm not entirely happy with the naming. Unfortunately, I don't have great suggestions either. I think in your current hierarchy, I would make the interface Random\NumberGenerator (with implementation in the sub-namespace), rather than Random\NumberGenerator\RandomNumberGenerator.
>>>
>>> Regards,
>>> Nikita
>>>
>>>> I've done a tidy implementation to make this final, but I'm currently suffering from error detection by Valgrind for unknown reasons.
>>>>
>>>> Implementation is here: https://github.com/php/php-src/pull/7079
>>>>
>>>> This can be reproduced with the following code.
>>>>
>>>> ```sh
>>>> # Success
>>>> $ valgrind ./sapi/cli/php -r '$random = new Random(); $random->nextInt();'
>>>> ==95522== Memcheck, a memory error detector
>>>> ==95522== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
>>>> ==95522== Using Valgrind-3.14.0 and LibVEX; rerun with -h for copyright info
>>>> ==95522== Command: ./sapi/cli/php -r $random\ =\ new\ Random();\ $random-\>nextInt();
>>>> ==95522==
>>>> ==95522==
>>>> ==95522== HEAP SUMMARY:
>>>> ==95522==     in use at exit: 1,286 bytes in 32 blocks
>>>> ==95522==   total heap usage: 28,445 allocs, 28,413 frees, 4,333,047 bytes allocated
>>>> ==95522==
>>>> ==95522== LEAK SUMMARY:
>>>> ==95522==    definitely lost: 0 bytes in 0 blocks
>>>> ==95522==    indirectly lost: 0 bytes in 0 blocks
>>>> ==95522==      possibly lost: 0 bytes in 0 blocks
>>>> ==95522==    still reachable: 1,286 bytes in 32 blocks
>>>> ==95522==         suppressed: 0 bytes in 0 blocks
>>>> ==95522== Rerun with --leak-check=full to see details of leaked memory
>>>> ==95522==
>>>> ==95522== For counts of detected and suppressed errors, rerun with: -v
>>>> ==95522== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
>>>>
>>>> # Fail
>>>> $ valgrind ./sapi/cli/php -r '$random = new Random(); $random->nextInt() === $random->nextInt();'
>>>> ==95395== Memcheck, a memory error detector
>>>> ==95395== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
>>>> ==95395== Using Valgrind-3.14.0 and LibVEX; rerun with -h for copyright info
>>>> ==95395== Command: ./sapi/cli/php -r $random\ =\ new\ Random();\ $random-\>nextInt()\ ===\ $random-\>nextInt();
>>>> ==95395==
>>>> ==95395== Conditional jump or move depends on uninitialised value(s)
>>>> ==95395==    at 0x966925: ZEND_IS_IDENTICAL_SPEC_VAR_VAR_HANDLER (zend_vm_execute.h:27024)
>>>> ==95395==    by 0x99AC27: execute_ex (zend_vm_execute.h:57236)
>>>> ==95395==    by 0x99C902: zend_execute (zend_vm_execute.h:59026)
>>>> ==95395==    by 0x8DB6B4: zend_eval_stringl (zend_execute_API.c:1191)
>>>> ==95395==    by 0x8DB861: zend_eval_stringl_ex (zend_execute_API.c:1233)
>>>> ==95395==    by 0x8DB8D6: zend_eval_string_ex (zend_execute_API.c:1243)
>>>> ==95395==    by 0xA4DAE4: do_cli (php_cli.c:995)
>>>> ==95395==    by 0xA4E8E2: main (php_cli.c:1366)
>>>> ==95395==
>>>> ==95395==
>>>> ==95395== HEAP SUMMARY:
>>>> ==95395==     in use at exit: 1,286 bytes in 32 blocks
>>>> ==95395==   total heap usage: 28,445 allocs, 28,413 frees, 4,333,070 bytes allocated
>>>> ==95395==
>>>> ==95395== LEAK SUMMARY:
>>>> ==95395==    definitely lost: 0 bytes in 0 blocks
>>>> ==95395==    indirectly lost: 0 bytes in 0 blocks
>>>> ==95395==      possibly lost: 0 bytes in 0 blocks
>>>> ==95395==    still reachable: 1,286 bytes in 32 blocks
>>>> ==95395==         suppressed: 0 bytes in 0 blocks
>>>> ==95395== Rerun with --leak-check=full to see details of leaked memory
>>>> ==95395==
>>>> ==95395== For counts of detected and suppressed errors, rerun with: -v
>>>> ==95395== Use --track-origins=yes to see where uninitialised values come from
>>>> ==95395== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
>>>> ```
>>>>
>>>> However, the detection is internal to the Zend VM and the cause has not been identified. From the code, it looks like memory management is being done properly.
>>>>
>>>> I have a somewhat tricky way of allocating memory to make the process common, do I need to give some hints to Valgrind?
>>>>
>>>> If you know, I would appreciate your advice on this issue.
>>>>
>>>> Regards,
>>>> Go Kudo
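
Editor's illustration of the point made in the reply at the top of this message, that a range-limited draw may consume more raw outputs than an argument-less one. This is only a sketch under stated assumptions: CountingXorShift32 and rangeByRejection() are hypothetical names invented here, the generator is a toy 32-bit xorshift rather than MT19937, and nothing below is taken from php-src or from the RFC's API. It assumes a 64-bit PHP 7.4+ build.

```php
<?php
declare(strict_types=1);

// Hypothetical toy generator for illustration only: a 32-bit xorshift that
// counts its raw outputs. It is NOT the RFC's NumberGenerator API and NOT
// MT19937. Assumes a 64-bit PHP build so 32-bit values fit in an int.
final class CountingXorShift32
{
    public int $draws = 0;
    private int $state;

    public function __construct(int $seed)
    {
        $masked = $seed & 0xFFFFFFFF;
        $this->state = $masked !== 0 ? $masked : 1; // xorshift state must be non-zero
    }

    // Returns the next raw value in [0, 2^32 - 1] and advances the state once.
    public function generate(): int
    {
        $x = $this->state;
        $x ^= ($x << 13) & 0xFFFFFFFF;
        $x ^= $x >> 17;
        $x ^= ($x << 5) & 0xFFFFFFFF;
        $this->state = $x;
        $this->draws++;
        return $x;
    }
}

// Unbiased draw from [$min, $max] by rejection sampling: raw values at or
// above the largest multiple of the span are discarded and redrawn, so one
// call may advance the generator more than once.
function rangeByRejection(CountingXorShift32 $rng, int $min, int $max): int
{
    $span = $max - $min + 1;
    $limit = 0x100000000 - (0x100000000 % $span);
    do {
        $raw = $rng->generate();
    } while ($raw >= $limit);
    return $min + ($raw % $span);
}

$rng = new CountingXorShift32(2021);
$max = 3 * (1 << 30) - 1; // span of 3 * 2^30: about one raw value in four is rejected
for ($i = 0; $i < 1000; $i++) {
    rangeByRejection($rng, 0, $max);
}
printf("range draws: 1000, raw outputs consumed: %d\n", $rng->draws);

// A shifted raw draw, in the spirit of the ">> 1" discussed in the thread,
// always consumes exactly one output and yields a non-negative 31-bit value.
$before = $rng->draws;
$shifted = $rng->generate() >> 1;
printf("shifted draw: %d, raw outputs consumed: %d\n", $shifted, $rng->draws - $before);
```

The counter makes the difference visible: the number of raw outputs consumed by the ranged draws depends on how many values were rejected, while the shifted draw always advances the state by exactly one step. Whether mt_rand() behaves this way for a given range depends on its actual implementation, which this sketch does not reproduce.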