Newsgroups: php.internals
Date: Sat, 5 Nov 2022 17:00:26 +0100
To: Go Kudo, Joshua Rüsweg, PHP internals
Subject: Re: [PHP-DEV] RFC
[Discussion]: Randomizer Additions
From: tim@bastelstu.be (Tim Düsterhus)

Hi

On 11/5/22 16:34, Go Kudo wrote:
> I am skeptical only about getFloat(). The use cases are limited and seem
> somewhat excessive. Do you have examples of how this is supported in other
> languages?

Yes, unfortunately getFloat() became pretty complex, but that is because "generating random floats correctly" is pretty complicated, due to how floats work.

The getFloat() method as proposed implements the γ-section algorithm as published in:

Drawing Random Floating-Point Numbers from an Interval. Frédéric Goualard, ACM Trans. Model. Comput. Simul., 32:3, 2022. https://doi.org/10.1145/3503512

This publication is just 7 months old. It explains how the implementation in every other programming language is broken in one way or another and proposes the γ-section algorithm as a not-broken alternative.

As floats are not uniformly dense and cannot represent all values, it is very easy to introduce a bias or to generate incorrect values. An example taken from the publication:

> php > $r = new Random\Randomizer();

We generate a random float in [0, 1) (allowing 0, but not 1) by dividing a random int between 0 and 2^53 - 1 by 2^53. This is effectively what ->nextFloat() does. It creates a uniformly distributed float with as many different values as possible, because a double (the underlying representation) has 53 bits of precision. The nextFloat() method is often the only thing that is available in other languages, e.g. JavaScript with Math.random() [1].

> php > $f = $r->getInt(0, (2**53 - 1)) / (2**53);
> php > var_dump($f);
> float(0.6942225382038698)

Now we want to turn this into a random float in [3.5, 4.5) (not allowing 4.5), because that's what we need.
The obvious formula, min + (max - min) * f, is also the one given in MDN for JavaScript's Math.random():

> php > $min = 3.5;
> php > $max = 4.5;
> php > var_dump($min + ($max - $min) * $f);
> float(4.19422253820387)

The simple formula appears to do the correct thing, and it would be correct if floats could represent all values. But what happens if the random integer is 2^53 - 1 (i.e. the maximum integer we allowed it to generate)?

> php > $f = (2**53 - 1) / (2**53);
> php > var_dump($f);
> float(0.9999999999999999)
> php > var_dump($min + ($max - $min) * $f);
> float(4.5)

In this case the result was rounded to 4.5, because the exact result (4.5 - 2^-53) is not representable as a double. Now an invalid value was generated!

Likewise, if you generate a random float between 0 and 1000 with this method, some values will appear more often than others due to rounding and the changing density of floats for each power of two.

With the γ-section algorithm by Prof. Goualard all these issues are eliminated, and that's what is used for getFloat(). The getFloat() method supports all 4 possible boundary combinations to ensure that users have a safe solution for all possible use cases, so that they don't need to build an unsafe solution in userland.

Best regards
Tim Düsterhus

[1] https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Math/random
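
For anyone who wants to reproduce both problems without relying on a lucky random draw, here is a small deterministic sketch. It only exercises the naive formula min + (max - min) * f from the example above; it is not the γ-section algorithm and not the getFloat() implementation. The helper name naiveScale() and the choice to inspect the 1000 largest possible inputs are assumptions made purely for illustration.

```php
<?php
// Hypothetical helper (not part of the RFC): the naive scaling formula
// discussed above, applied to a float $f that is assumed to lie in [0, 1).
function naiveScale(float $min, float $max, float $f): float
{
    return $min + ($max - $min) * $f;
}

// 1) Boundary violation: the largest value the division can produce is
//    still below 1.0, yet the scaled result lands exactly on the excluded
//    upper bound 4.5.
$largest = (2**53 - 1) / (2**53);
var_dump($largest < 1.0);                          // bool(true)
var_dump(naiveScale(3.5, 4.5, $largest) === 4.5);  // bool(true)

// 2) Bias: scale the 1000 largest possible inputs into [0, 1000) and count
//    how many neighbouring inputs collapse onto the same output float.
//    Near 1000 the outputs are spaced more coarsely than the scaled inputs,
//    so some outputs are hit twice.
$collisions = 0;
$previous = null;
for ($i = 0; $i < 1000; $i++) {
    $f = (2**53 - 1 - $i) / (2**53);
    $value = naiveScale(0.0, 1000.0, $f);
    if ($value === $previous) {
        $collisions++;
    }
    $previous = $value;
}
var_dump($collisions > 0);                         // bool(true)
```

The second check shows the bias only at the very top of the range; elsewhere in [0, 1000) the pattern differs because the density of floats changes with each power of two.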