Newsgroups: php.internals Path: news.php.net Xref: news.php.net php.internals:100239 Return-Path: Mailing-List: contact internals-help@lists.php.net; run by ezmlm Delivered-To: mailing list internals@lists.php.net Received: (qmail 83088 invoked from network); 16 Aug 2017 22:58:00 -0000 Received: from unknown (HELO lists.php.net) (127.0.0.1) by localhost with SMTP; 16 Aug 2017 22:58:00 -0000 Authentication-Results: pb1.pair.com smtp.mail=nikita.ppv@gmail.com; spf=pass; sender-id=pass Authentication-Results: pb1.pair.com header.from=nikita.ppv@gmail.com; sender-id=pass Received-SPF: pass (pb1.pair.com: domain gmail.com designates 209.85.223.178 as permitted sender) X-PHP-List-Original-Sender: nikita.ppv@gmail.com X-Host-Fingerprint: 209.85.223.178 mail-io0-f178.google.com Received: from [209.85.223.178] ([209.85.223.178:38678] helo=mail-io0-f178.google.com) by pb1.pair.com (ecelerity 2.1.1.9-wez r(12769M)) with ESMTP id 38/99-34801-7FDC4995 for ; Wed, 16 Aug 2017 18:57:59 -0400 Received: by mail-io0-f178.google.com with SMTP id g71so17932013ioe.5 for ; Wed, 16 Aug 2017 15:57:59 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc; bh=ni8aL4ZbtGdbqK8W4ZcOw5HBvw2gaB+ky7bmKsK/9rM=; b=IQ8Gbrps1s+0Sm2oc27/sblN+1R0/1MRQgETtG8fKMmeQoaRriWG4F7Wqsrb2t9WvI s2VpE69luSnYHm9gQU8g4s15M6HKLSS8lTtTtM+aov7Qj8E4fagO2PXljZonp4kkVb8t vD81eLyc8MCU1o34SYO0Hu16094cUOGyhAF2Mx51TT35066vWhjcMOajf+jYC13b5bEm 9yEFkYgowb1BLrSOnmszOVyDSJFtx+3rnlkgt0tql0O+dz47PnwfQHKIxCbRao0Z0s1c zFB/0o0g0IKOrbJa6BEoAfUbcRxINb6U9iyt86wsAwRPs3WQESwJ0clVH7dMFnYjHLSZ hjAA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc; bh=ni8aL4ZbtGdbqK8W4ZcOw5HBvw2gaB+ky7bmKsK/9rM=; b=QpGhXcya9RxU52oUWJBCtfbvJMsz1BELygod8W6bLZXmCvLcAAttu0sTIomcful3Cd OYDyolHXhk0eLGAfoX/vd5jpiK7xLaQZUiXGQv03tW/tK9ubYCDYNJg/Ete1cEycN37c beVKFzhz9itkMT2S2kLrCwd0C0PYvmFZIPzIl8HesBTmZkmeE6biXgSRwHXDo25WWoo7 vHH5ZJ3ln8ksArjOBARQGzkTCNmXXk9ARKlrIn2AFPirWAkt1yTo01fRkiXgazQqn4bl kFT7+JG+uJVloTFcliV5Bg/igadJIkFtI+SKhu8rM7XVfdcn2Ev/nxZuPo/2rmN7xrwz 6TvQ== X-Gm-Message-State: AHYfb5jhQ3Yg25/orFgeE+H+wrBLQ+zlXBecBT0PY1TCAJRQdC/5og/5 6HlHa4Yhj2ml96o8OU6jHaECSP710w== X-Received: by 10.107.166.203 with SMTP id p194mr3076270ioe.48.1502924277113; Wed, 16 Aug 2017 15:57:57 -0700 (PDT) MIME-Version: 1.0 Received: by 10.107.13.3 with HTTP; Wed, 16 Aug 2017 15:57:56 -0700 (PDT) In-Reply-To: <20170816220242.GA13012@openwall.com> References: <20170816191247.GA12324@openwall.com> <20170816214155.GA12831@openwall.com> <20170816220242.GA13012@openwall.com> Date: Thu, 17 Aug 2017 00:57:56 +0200 Message-ID: To: Solar Designer Cc: Leigh , PHP internals Content-Type: multipart/alternative; boundary="001a11414af4a967550556e6d3e8" Subject: Re: [PHP-DEV] PHP 7.1.0 to 7.2.0beta2 mt_rand() modulo bias bug From: nikita.ppv@gmail.com (Nikita Popov) --001a11414af4a967550556e6d3e8 Content-Type: text/plain; charset="UTF-8" On Thu, Aug 17, 2017 at 12:02 AM, Solar Designer wrote: > On Wed, Aug 16, 2017 at 11:41:55PM +0200, Solar Designer wrote: > > On Wed, Aug 16, 2017 at 10:06:02PM +0200, Nikita Popov wrote: > > > I'd suggest to split the 32-bit and 64-bit code codepaths entirely, as > the > > > interleaved #ifs are somewhat hard to follow. Something like > > > https://gist.github.com/nikic/64e7ec58ebb6121d350fb80927a65082 (not > > > thoroughly tested). > > > > This looks good to me - especially how you reduced the nesting of if's > > by special-casing the "Powers of two are not biased" return. With this > > change, you can as well drop the "Special case where no modulus is > > required", as it'd happen to be handled the same by your new return. > > OTOH, that optimization might require an extra comment on its own. > > > > Here's what this might look like (totally untested): > > > > https://gist.github.com/solardiz/5e3d313bbee2c1ce6e200e433b750bef > > One difference I didn't notice at first: the currently committed code > does only one php_mt_rand() call per loop iteration when it's skipping > "numbers over the limit", even in the 64-bit case (thus, it replaces 32 > bits of the value at a time). Your 64-bit version (and my revision of > it) does two calls per iteration (replacing the whole number). > > Arguably, the currently committed code is smarter and more efficient in > that respect, whereas yours is cleaner. Regardless, it's an extra > change that will affect the generated random numbers in some cases where > we could avoid making that change (it's not fixing any bug). > > I think it's preferable not to unnecessarily change what numbers are > generated. > > Alexander > Good point. However, I think that this optimization is not actually correct. For example, let's take umax = 0x1_0000_0001, which is the smallest value for which this codepath is taken. In this case limit would be 0xffff_ffff_ffff_fffe, so we only resample if the value is exactly 0xffff_ffff_ffff_ffff. So if the resampling codepath is taken in this case, and we only generate one new 32-bit value, the top word of the result will be fixed to 0xffff_ffff. (A very small bias in this case, but there's probably more significant cases.) Nikita --001a11414af4a967550556e6d3e8--