Newsgroups: php.internals Path: news.php.net Xref: news.php.net php.internals:115275 Return-Path: Delivered-To: mailing list internals@lists.php.net Received: (qmail 18354 invoked from network); 2 Jul 2021 13:37:31 -0000 Received: from unknown (HELO php-smtp4.php.net) (45.112.84.5) by pb1.pair.com with SMTP; 2 Jul 2021 13:37:31 -0000 Received: from php-smtp4.php.net (localhost [127.0.0.1]) by php-smtp4.php.net (Postfix) with ESMTP id 0847A1804CC for ; Fri, 2 Jul 2021 06:58:17 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on php-smtp4.php.net X-Spam-Level: X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,FREEMAIL_FROM,HTML_MESSAGE, RCVD_IN_DNSWL_NONE,RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_PASS autolearn=no autolearn_force=no version=3.4.2 X-Spam-Virus: No X-Envelope-From: Received: from mail-ed1-f52.google.com (mail-ed1-f52.google.com [209.85.208.52]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange ECDHE (P-256) server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by php-smtp4.php.net (Postfix) with ESMTPS for ; Fri, 2 Jul 2021 06:58:16 -0700 (PDT) Received: by mail-ed1-f52.google.com with SMTP id s15so13306907edt.13 for ; Fri, 02 Jul 2021 06:58:16 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to; bh=uWV1bd4VMYprqDCinklOgjCGyeBLIGWDj2Otcrt4tvI=; b=jun6UsoyX5vsmQzK0IBDpJnahRnEXQ2CGb3nvy+YnnjPGbnbtzmUQuDWFtG0aVfhpU BzTkT6yiqt5NpnXYogJ5sFj2nH2D/riYNwP4J3FF7OPoSYu+mskB19SAlmnugricLVwS H8uX7Bxr0DPOT3ZuZU68w24Y5OvELioFW0LuG2haUaPo68Zn7hk4FPuy5IiSt0+j1/TT IuQkK+lJiCe+cfbytzqPGPUUi2gPv/A/YmdhSV0p6ur2p8G4mpECij3dzBUDumsDPVjh lq4jmAzeTunS8KFFjUeIuYCd0Z0yoNNI5tHLn0QA8wE0GHfhwOJ/Ng80Z1gPRIc0yacP huZQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to; bh=uWV1bd4VMYprqDCinklOgjCGyeBLIGWDj2Otcrt4tvI=; b=DY86ae3Q5/KszhGum0fhGxv+i8D5vVdSQiD/KwogrWTMcBzVNrhrylqwT1uubmzGC9 QQJ5078b4oDN/b1lTEVKuENS3KD87RC4mTHASRDlzekKQEYqjfVFf7EMER5pGAPbtDy+ i5mzdvueeUapUQsMnNYdW/k2bPNSjhbAXWbZS6mI6ElLAIpp9RNCBMKaJKJclEMU25X7 6qPisJ6BwLBaMweown1tim4TIFyRyNe69pEIdKZqB1gUuQki3Q2MbMoMZDbfIz1ydZXI YX7uJt2bRs+TdRRHBguObKJyOmdOpjFs6EE+6qwFhJ3Fnr85/drpd8bUk0fN8WM04sjZ 17/w== X-Gm-Message-State: AOAM531oWs2z7CGfn0pj1OaZ4E4JtzFnd6Ffk9Vh9kfi2vuXYqbuZzkK gQo9fo6AfWoaab2Paukk0QtUFO5JUHmEfKfxY1o= X-Google-Smtp-Source: ABdhPJxCLh2b30Hyos8Qt1CgAvs1/2RJUqOZVf6ktWEN/u7gOKzh9dxt1snSTTMaMhSrrwSqO2MxRSozCAmcGAK6+Hg= X-Received: by 2002:a05:6402:214:: with SMTP id t20mr7171030edv.20.1625234294244; Fri, 02 Jul 2021 06:58:14 -0700 (PDT) MIME-Version: 1.0 References: In-Reply-To: Date: Fri, 2 Jul 2021 22:58:03 +0900 Message-ID: To: Nikita Popov , PHP internals Content-Type: multipart/alternative; boundary="000000000000c9083105c62458ff" Subject: Re: [PHP-DEV] [RFC] Add Random Extension (before: Add Random class) From: zeriyoshi@gmail.com (Go Kudo) --000000000000c9083105c62458ff Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable > * The first bit is just clarification. After a cursory look at the implementation, my understanding is that the getInt(), shuffleArray() and shuffleString() APIs will always produce consistent results on 32-bit and 64-bit, as long as your inputs are 32-bit as well (i.e., min/max are 32-bit and string is smaller than 4G). Is that correct? The only APIs that would exhibit different behavior are nextInt() and getBytes(), right? Yes. I do not want to break the compatibility of the implementation. I would prefer to be able to migrate code that uses the current internal state. > * Looking at the implementation, nextInt() performs a >> 1 operation on the RNG result. I assume the motivation is to get back a non-negative number. But why do we want that? The "nextInt()" name doesn't really indicate that it's a positive number. I think more generally, my question here may be "Why does this method exist at all? When would you use it instead of getInt()?" This was to allow for future forward compatibility. When PHP_INT_SIZE exceeds 8, the result will be incompatible without bit shifting. This is similar to the way mt_rand() does bit shifting now. However, I can agree that such a day will never come in reality. And as the comments on GitHub show, there are ways to keep the values compatible even if such a time comes. After thinking about it for a while, I finally came to the conclusion that there is no benefit to this other than to make mt_rand() and Random\NumberGenerator\MT19937 directly compatible. If compatibility is needed, it can be achieved by bit shifting in the PHP code, so direct compatibility is probably unnecessary. I will change the implementation and remove this option. > "Why does this method exist at all? When would you use it instead of getInt()?" The case for this would be if you want to get a raw unrounded random number sequence as a number. The situations where this is required would certainly be limited, but it would be nice to have. (At least, I know of several production codes that use the result of mt_rand() with no arguments.) > * I don't really get why we need RandomInterface. I think if the choice is between "final + interface" and "non-final without interface", I'd prefer the latter (though I'm also happy with "final without interface"). I had completely lost my train of thought on this. The interface makes the Random class unextensible. I have removed this. > I'm not entirely happy with the naming. Unfortunately, I don't have great suggestions either. I think in your current hierarchy, I would make the interface Random\NumberGenerator (with implementation in the sub-namespace), rather than Random\NumberGenerator\RandomNumberGenerator. Deep-rooted problem. For now, I'm going to change RandomNumberGenerator to Random\NumberGenerator. It's the best one so far. I continue to be plagued by Valgrind warnings and crashes of Windows ZTS builds... I'd like to make a voting phase that is fixed ... Regards, Go Kudo 2021=E5=B9=B46=E6=9C=8829=E6=97=A5(=E7=81=AB) 23:01 Nikita Popov : > On Sat, Jun 26, 2021 at 2:40 AM Go Kudo wrote: > >> Hello Internals. >> >> RFC has been reorganized for finalization. >> >> https://wiki.php.net/rfc/rng_extension >> >> The changes from the previous version are as follows: >> >> - Changed again to a class-based approach. The argument can be omitted, = in >> which case an instance of XorShift128Plus will be created automatically. >> - Future scope was specified in the RFC and the functionality was >> separated >> as a Random extension. >> - Changed to separate it as a Random extension and use the appropriate >> namespace. >> - In order to extend the versatility of the final class, Random, a >> RandomInterface has been added, similar in approach to the >> DateTimeInterface. >> > > The updated proposal looks quite nice :) I think this is close to done. > Some small bits of feedback: > > * The first bit is just clarification. After a cursory look at the > implementation, my understanding is that the getInt(), shuffleArray() and > shuffleString() APIs will always produce consistent results on 32-bit and > 64-bit, as long as your inputs are 32-bit as well (i.e., min/max are 32-b= it > and string is smaller than 4G). Is that correct? The only APIs that would > exhibit different behavior are nextInt() and getBytes(), right? > * Looking at the implementation, nextInt() performs a >> 1 operation on > the RNG result. I assume the motivation is to get back a non-negative > number. But why do we want that? The "nextInt()" name doesn't really > indicate that it's a positive number. I think more generally, my question > here may be "Why does this method exist at all? When would you use it > instead of getInt()?" > * Another bit of clarification: For the user-defined RNG, which range is > generate() expected to return? I assume that it must return the native > integer size, i.e. 32-bit on 32-bit and 64-bit on 64-bit? > * I don't really get why we need RandomInterface. I think if the choice > is between "final + interface" and "non-final without interface", I'd > prefer the latter (though I'm also happy with "final without interface"). > * I'm not entirely happy with the naming. Unfortunately, I don't have > great suggestions either. I think in your current hierarchy, I would make > the interface Random\NumberGenerator (with implementation in the > sub-namespace), rather than Random\NumberGenerator\RandomNumberGenerator. > > Regards, > Nikita > > I've done a tidy implementation to make this final, but I'm currently >> suffering from error detection by Valgrind for unknown reasons. >> >> Implementation is here: https://github.com/php/php-src/pull/7079 >> >> This can be reproduced with the following code. >> >> ```sh >> # Success >> $ valgrind ./sapi/cli/php -r '$random =3D new Random(); $random->nextInt= ();' >> =3D=3D95522=3D=3D Memcheck, a memory error detector >> =3D=3D95522=3D=3D Copyright (C) 2002-2017, and GNU GPL'd, by Julian Sewa= rd et al. >> =3D=3D95522=3D=3D Using Valgrind-3.14.0 and LibVEX; rerun with -h for co= pyright >> info >> =3D=3D95522=3D=3D Command: ./sapi/cli/php -r $random\ =3D\ new\ Random()= ;\ >> $random-\>nextInt(); >> =3D=3D95522=3D=3D >> =3D=3D95522=3D=3D >> =3D=3D95522=3D=3D HEAP SUMMARY: >> =3D=3D95522=3D=3D in use at exit: 1,286 bytes in 32 blocks >> =3D=3D95522=3D=3D total heap usage: 28,445 allocs, 28,413 frees, 4,333= ,047 bytes >> allocated >> =3D=3D95522=3D=3D >> =3D=3D95522=3D=3D LEAK SUMMARY: >> =3D=3D95522=3D=3D definitely lost: 0 bytes in 0 blocks >> =3D=3D95522=3D=3D indirectly lost: 0 bytes in 0 blocks >> =3D=3D95522=3D=3D possibly lost: 0 bytes in 0 blocks >> =3D=3D95522=3D=3D still reachable: 1,286 bytes in 32 blocks >> =3D=3D95522=3D=3D suppressed: 0 bytes in 0 blocks >> =3D=3D95522=3D=3D Rerun with --leak-check=3Dfull to see details of leake= d memory >> =3D=3D95522=3D=3D >> =3D=3D95522=3D=3D For counts of detected and suppressed errors, rerun wi= th: -v >> =3D=3D95522=3D=3D ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0= from 0) >> >> # Fail >> $ valgrind ./sapi/cli/php -r '$random =3D new Random(); $random->nextInt= () >> =3D=3D=3D $random->nextInt();' >> =3D=3D95395=3D=3D Memcheck, a memory error detector >> =3D=3D95395=3D=3D Copyright (C) 2002-2017, and GNU GPL'd, by Julian Sewa= rd et al. >> =3D=3D95395=3D=3D Using Valgrind-3.14.0 and LibVEX; rerun with -h for co= pyright >> info >> =3D=3D95395=3D=3D Command: ./sapi/cli/php -r $random\ =3D\ new\ Random()= ;\ >> $random-\>nextInt()\ =3D=3D=3D\ $random-\>nextInt(); >> =3D=3D95395=3D=3D >> =3D=3D95395=3D=3D Conditional jump or move depends on uninitialised valu= e(s) >> =3D=3D95395=3D=3D at 0x966925: ZEND_IS_IDENTICAL_SPEC_VAR_VAR_HANDLER >> (zend_vm_execute.h:27024) >> =3D=3D95395=3D=3D by 0x99AC27: execute_ex (zend_vm_execute.h:57236) >> =3D=3D95395=3D=3D by 0x99C902: zend_execute (zend_vm_execute.h:59026) >> =3D=3D95395=3D=3D by 0x8DB6B4: zend_eval_stringl (zend_execute_API.c:= 1191) >> =3D=3D95395=3D=3D by 0x8DB861: zend_eval_stringl_ex (zend_execute_API= .c:1233) >> =3D=3D95395=3D=3D by 0x8DB8D6: zend_eval_string_ex (zend_execute_API.= c:1243) >> =3D=3D95395=3D=3D by 0xA4DAE4: do_cli (php_cli.c:995) >> =3D=3D95395=3D=3D by 0xA4E8E2: main (php_cli.c:1366) >> =3D=3D95395=3D=3D >> =3D=3D95395=3D=3D >> =3D=3D95395=3D=3D HEAP SUMMARY: >> =3D=3D95395=3D=3D in use at exit: 1,286 bytes in 32 blocks >> =3D=3D95395=3D=3D total heap usage: 28,445 allocs, 28,413 frees, 4,333= ,070 bytes >> allocated >> =3D=3D95395=3D=3D >> =3D=3D95395=3D=3D LEAK SUMMARY: >> =3D=3D95395=3D=3D definitely lost: 0 bytes in 0 blocks >> =3D=3D95395=3D=3D indirectly lost: 0 bytes in 0 blocks >> =3D=3D95395=3D=3D possibly lost: 0 bytes in 0 blocks >> =3D=3D95395=3D=3D still reachable: 1,286 bytes in 32 blocks >> =3D=3D95395=3D=3D suppressed: 0 bytes in 0 blocks >> =3D=3D95395=3D=3D Rerun with --leak-check=3Dfull to see details of leake= d memory >> =3D=3D95395=3D=3D >> =3D=3D95395=3D=3D For counts of detected and suppressed errors, rerun wi= th: -v >> =3D=3D95395=3D=3D Use --track-origins=3Dyes to see where uninitialised v= alues come >> from >> =3D=3D95395=3D=3D ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0= from 0) >> ``` >> >> However, the detection is internal to the Zend VM and the cause has not >> been identified. From the code, it looks like memory management is being >> done properly. >> >> I have a somewhat tricky way of allocating memory to make the process >> common, do I need to give some hints to Valgrind? >> >> If you know, I would appreciate your advice on this issue. >> >> Regards, >> Go Kudo >> > --000000000000c9083105c62458ff--