Newsgroups: php.internals Path: news.php.net Xref: news.php.net php.internals:108976 Return-Path: Delivered-To: mailing list internals@lists.php.net Received: (qmail 40127 invoked from network); 11 Mar 2020 13:20:59 -0000 Received: from unknown (HELO php-smtp4.php.net) (45.112.84.5) by pb1.pair.com with SMTP; 11 Mar 2020 13:20:59 -0000 Received: from php-smtp4.php.net (localhost [127.0.0.1]) by php-smtp4.php.net (Postfix) with ESMTP id BE4B21804F3 for ; Wed, 11 Mar 2020 04:42:05 -0700 (PDT) X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on php-smtp4.php.net X-Spam-Level: X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,FREEMAIL_FROM,HTML_MESSAGE, RCVD_IN_DNSWL_NONE,RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_PASS autolearn=no autolearn_force=no version=3.4.2 X-Spam-ASN: AS15169 209.85.128.0/17 X-Spam-Virus: No X-Envelope-From: Received: from mail-lj1-f196.google.com (mail-lj1-f196.google.com [209.85.208.196]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange ECDHE (P-256) server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by php-smtp4.php.net (Postfix) with ESMTPS for ; Wed, 11 Mar 2020 04:42:05 -0700 (PDT) Received: by mail-lj1-f196.google.com with SMTP id 19so1932610ljj.7 for ; Wed, 11 Mar 2020 04:42:05 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=9YwSQRJX4Kvyj1oEmBrxpK/BDPfmz/e7X5EXEMy8PQk=; b=ve8Fc/TJgh/++8TZ9xHt647EYXEC2kk6WG42hSil6a+8Zau6n29tgTDgVwp+tP+232 Wr2sQM6/Nb/aDwfW4q0xly3Q8Z7E9NnVNawt5ixV02cPP8iE7/ObdOlpcnE7ZbtW316Q GLiBVZ14FSacEfT0QIwtr4xdW3E4cWRRacEqj+I+eIHyG1bv6CxG/iwx+nj/8AM8ZSkx oTUKWjW3c5rsGf+VdOz5QaHGZMd/WX1bi5UWwgdFlQ4qKHFewJM0Y9XycnKgLg8rV6Jh cYl2OTPRVcvEk3vKl1IgF3fE5dHftX5tTD/YVwAYiUhlzJjB0NDJoQhGyDGl1WToejD0 T2Yg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=9YwSQRJX4Kvyj1oEmBrxpK/BDPfmz/e7X5EXEMy8PQk=; b=bJNGUDf8Giu0/nyiNhUBpQqghZahFORCznt+hKzZaPvoYYZ+4PTNTMAVJiOV6OCNmw kP9wwDCaamle+zE7ocPBNNSpn+/XhviEt8C8cPq0U18ZPREFMBGmjWzC9bP4oKlPOExA z3A0ufxNDsdTlZ0Qu0ZdGoeZrb88r04jmdXpBImrb78GDer7A6NKIAoFOslsZiK57ISa 4DdAAweJTmv7WZBoyamrqzXjW1T9FO1+cJf0PHQDiXg9aBoEKfgJG9zmhx3L12E3mfzK uwBxkTvC4vYvNxN5rmC/yhIc0CtMo2U4KWkyPNN4VD6uUF10xCdYyCnfU/SRQwF17X4Q zjbQ== X-Gm-Message-State: ANhLgQ0j+ZzqlucmecPX8YYx2Up1da8pXXa7sjURMFY9xAm2f3CNqmjI WOL+dXFjtGrWaezqFxjDp5V5OpBFAL8E+2y/MHxPw5VyGPxLlw== X-Google-Smtp-Source: ADFU+vuGBscNmZjb9TnJA3pv+oYIZMKUPZBP4Mwwfxv5USDGiXcJL4FJDEBoFz8tjfeIZJSvp8LxfFaaME9M0eEK/Z8= X-Received: by 2002:a2e:b554:: with SMTP id a20mr1932188ljn.34.1583926923778; Wed, 11 Mar 2020 04:42:03 -0700 (PDT) MIME-Version: 1.0 References: <9a590afe-8674-4d5c-b18f-2bc15786348c@www.fastmail.com> In-Reply-To: <9a590afe-8674-4d5c-b18f-2bc15786348c@www.fastmail.com> Date: Wed, 11 Mar 2020 12:41:46 +0100 Message-ID: To: Larry Garfield Cc: php internals Content-Type: multipart/alternative; boundary="000000000000a4602305a092b922" Subject: Re: [PHP-DEV] Make sorting stable From: nikita.ppv@gmail.com (Nikita Popov) --000000000000a4602305a092b922 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable On Wed, Mar 4, 2020 at 8:49 PM Larry Garfield wrote: > On Wed, Mar 4, 2020, at 12:35 PM, Sara Golemon wrote: > > On Wed, Mar 4, 2020 at 12:12 PM Nikita Popov > wrote: > > > > > On Wed, Mar 4, 2020 at 6:17 PM Sara Golemon wrote: > > > > > >> On Wed, Mar 4, 2020 at 10:42 AM Nikita Popov > > >> wrote: > > >> > > >>> The only issue I ran into is that this change has a negative impact > on > > >>> code > > >>> using illegal comparison callbacks like this: > > >>> > > >>> usort($array, function($a, $b) { > > >>> return $a > $b; > > >>> }); > > >>> > > >>> Let's define what "negative impact" means in this regard. Is it th= at > > >> one still winds up with an essentially sorted array, but hitherto > "stable > > >> appering" output is now stable in a different way? Or is the result > > >> actually just NOT sorted in a way that a reasonable user would > consider > > >> correct (e.g. 5 sorted before "3")? > > >> > > > > > > "Negative impact" is PR speak for "completely broken" ;) Yes, it coul= d > > > sort 5 before 3. The comparison function becomes completely > ill-defined and > > > you get Garbage-In-Garbage-Out. > > > > > > Roger that, thanks for explaining. =F0=9F=91=8D > > > > > > > If we are concerned about this (I'm not sure we should be, but I've a= t > > > least seen people be interested about this peculiar behavior: > > > > https://stackoverflow.com/questions/59908195/how-come-usort-php-works-eve= n-when-not-returning-integers > ), > > > there's two things we can do: > > > > > > 1. As Tyson suggests, we can throw a warning if a boolean is returned > from > > > the comparison callback (probably with a check to only throw it once > per > > > sort), to make it obvious what the cause is. > > > > > > 2. We can fix this transparently by doing something like this > internally: > > > > > > $result =3D $compare($a, $b); > > > if ($result !=3D=3D false) { > > > return (int) $result; // Normal behavior > > > } > > > > > > // Bad comparison function, try the reverse order as well > > > return -$compare($b, $a); > > > > > > > > =C2=BFPor que no los dos? > > > > Fixing broken stuff transparently for a smooth upgrade path PLUS lettin= g > > people know they're doing it wrong seems win-win and low cost. > > > > -Sara > > > I concur. If a comparison function returns a non-legal value: > > 1) Fold it to a legal value. if feasible. > 2) Raise a notice/warning, maybe deprecated? that tells them they're Doin= g > It Wrong(tm) > 3) Sometime in the future turn that notice/warning into a hard error. > > If I'm understanding the definition of stable here, it means that if two > values evaluate to equal they will always end up in the same order in the > output that they were in the input, yes? So (trivial example): > > $a =3D ["3", 2, 3, 5]; > > Sorts to: > > [2, "3", 3, 5]; > > always, whereas right now it may sort to that or to [2, 3, "3", 5], > somewhat randomly. Am I understanding correctly? > That's right. Of course, this is not a particular useful case :) Typically, stable sorting is useful if you are sorting only by some "part" of the value. The most common case is sorting objects by one field: usort($users, function($user1, $user2) { return $user1->age <=3D> $user2->age; }); With stable sorting, this will sort users by age, but otherwise leave the order of the objects alone. For example, if $users was originally sorted by ID, then the objects will now be sorted by age first and ID second. With unstable sorting, the order within one age bracket would be undefined. Another example is asort(), which sorts by value but preserves keys. $array =3D [ 'foo' =3D> 1, 'bar' =3D> 1, 'oof' =3D> 0, 'rab' =3D> 0, ]; asort($array); With stable sorting, this will always result in $array =3D [ 'oof' =3D> 0, 'rab' =3D> 0, 'foo' =3D> 1, 'bar' =3D> 1, ]; while with unstable sorting, the order of keys for a given value is undefined. Because stable sorting is the more useful default choice, most high-level languages nowadays default to stable sorting (Java and Python have been stable for a long time, JavaScript mandates stability since recently -- Ruby is a counter-example that uses unstable sorting.) Low-level languages tend to expose both stable and unstable sorts. Regards, Nikita --000000000000a4602305a092b922--