Newsgroups: php.internals Path: news.php.net Xref: news.php.net php.internals:108841 Return-Path: Delivered-To: mailing list internals@lists.php.net Received: (qmail 86247 invoked from network); 4 Mar 2020 18:23:02 -0000 Received: from unknown (HELO php-smtp4.php.net) (45.112.84.5) by pb1.pair.com with SMTP; 4 Mar 2020 18:23:02 -0000 Received: from php-smtp4.php.net (localhost [127.0.0.1]) by php-smtp4.php.net (Postfix) with ESMTP id A14331804D3 for ; Wed, 4 Mar 2020 08:42:29 -0800 (PST) X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on php-smtp4.php.net X-Spam-Level: X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,FREEMAIL_FROM,HTML_MESSAGE, RCVD_IN_DNSWL_NONE,RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_PASS autolearn=no autolearn_force=no version=3.4.2 X-Spam-ASN: AS15169 209.85.128.0/17 X-Spam-Virus: No X-Envelope-From: Received: from mail-lj1-f178.google.com (mail-lj1-f178.google.com [209.85.208.178]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange ECDHE (P-256) server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by php-smtp4.php.net (Postfix) with ESMTPS for ; Wed, 4 Mar 2020 08:42:29 -0800 (PST) Received: by mail-lj1-f178.google.com with SMTP id q23so2770703ljm.4 for ; Wed, 04 Mar 2020 08:42:29 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:from:date:message-id:subject:to; bh=xrVcXLooTki3DfIOM8O2oYSmYdUqJuURJH4qr0MfND0=; b=aM4hi3XXQPMcx0bKxr1F6enbQ0G4NN9hzcc3nwNwfPWX5nHwLe7/szj5jC6R8XBJAY CP8B9TLbZGAazm1K9qSHx3uCscpuS5HhrYu4N/5Fqs4xCuaKOHNyW2zaiCLtZwxC1nLp zMvRFqZEc56RZx4XL5sKIvEDPbYUcfExmF0Ag/OG+5OYc+awDOaHMtgRgafWEQF8lzDx 6HmwlETGusfrVSDceAs2APoB6avZ3YNcqa+Mx2TTJEPovbyEX+lny7kNyy65yccxJt84 xkamufgXE6LZq8/J/tkL+1Rq7nUZ6zfkngdWXE8+tcUnALhx6XyY0/WqSIyLyNuWWDuq jcHQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:from:date:message-id:subject:to; bh=xrVcXLooTki3DfIOM8O2oYSmYdUqJuURJH4qr0MfND0=; b=GsQ+Sz5uL/tqxQEU56BKXJfYISadJFEi1r1F5qmn1uigMX8ZZWgaDaA5L7vVa6IeWh yG7RB1148hIOz8O19j9VW7jM1h26mgel9HARxJuKk3i/+yYUQnIsgW4gcDowk92NMUhc WAsjhBgRBQHPrSAqG3fisbJ6BrSG24RnLhJKcLB/YpB5nqJyr3HkH3IFBvdG3tMU3ePr gVT1m4yx4CuE6R27ORvJXznkZIfuqthR2tO+OIgEqVJI2KbbEwEShwsJtJv7Nr9SzVSn SAbA+lv+CmbzBlX74rZgAFMtkFdhjT9Tn2egslkMcrH34RlCLp/mJ12avyT7k//DtW1i vfeA== X-Gm-Message-State: ANhLgQ2MmGbxIqrsGDQOIae0xpNgpZewyBqsrFrzWqDY8gj3WGBgxlwL CQ4DMYBca7n/rfoJMZLAnwAEYYs3HBUbfAaNk8nDrNlXNDk= X-Google-Smtp-Source: ADFU+vvfppW3LpGdKg+N1LGGEN2ybnoFLGNxDBw72I1raMSEyns3ouwvZEuR2RCx4cdWzCK5mfnNwsDLt48G+SWWufQ= X-Received: by 2002:a2e:7806:: with SMTP id t6mr2518478ljc.145.1583340143481; Wed, 04 Mar 2020 08:42:23 -0800 (PST) MIME-Version: 1.0 Date: Wed, 4 Mar 2020 17:42:07 +0100 Message-ID: To: PHP internals Content-Type: multipart/alternative; boundary="000000000000cf953905a00a1a16" Subject: Make sorting stable From: nikita.ppv@gmail.com (Nikita Popov) --000000000000cf953905a00a1a16 Content-Type: text/plain; charset="UTF-8" Hi internals, Sorting function in PHP currently do not guarantee stability, which means that the result order of elements that compare equal is not defined. To achieve a stable sort, you need to do something like this (untested): $arrayAndPos = []; $pos = 0; foreach ($array as $value) { $arrayAndPos[] = [$value, $pos++]; } usort($arrayAndPos, function($a, $b) use($compare) { return $compare($a[0], $b[0]) ?: $a[1] <=> $b[1]; }); $array = []; foreach ($arrayAndPos as $elem) { $array[] = $elem[0]; } This is both cumbersome and very inefficient. Those temporary arrays significantly increase memory usage, and hurt performance of the sort. I believe that it is important for us to provide a stable sorting option in the standard library. We could introduce a separate stable function(s) for this purpose, but I believe we can just as well make all our existing sorts stable. This does not require actually switching sorting algorithms to something like Timsort. We can essentially do the same as the above PHP code, just much more efficiently. Due to certain implementation details, we can do this without memory usage increase, and with minimal performance impact. I've put https://github.com/php/php-src/pull/5236 up as a prototype. The only issue I ran into is that this change has a negative impact on code using illegal comparison callbacks like this: usort($array, function($a, $b) { return $a > $b; }); The return value of the sorting callback should be $a <=> $b, but using $a > $b also works right now, due to implementation details of the sorting implementation (only the outcome of $compare($a, $b) > 0 ends up being used by the sort). This kind of incorrect code will break under the proposed implementation, because we will now compare by original position if the comparison function reports equality. Because the comparator reports equality inconsistently (it says that $a == $b, but $b != $a), the sort results are also inconsistent. What do people think about this? Is there interest in making sorting stable? Is it okay to break code using illegal comparison callbacks? Regards, Nikita --000000000000cf953905a00a1a16--