Newsgroups: php.internals Path: news.php.net Xref: news.php.net php.internals:108854 Return-Path: Delivered-To: mailing list internals@lists.php.net Received: (qmail 26298 invoked from network); 4 Mar 2020 22:00:58 -0000 Received: from unknown (HELO localhost.localdomain) (76.75.200.58) by pb1.pair.com with SMTP; 4 Mar 2020 22:00:58 -0000 To: internals@lists.php.net References: Date: Wed, 4 Mar 2020 21:20:27 +0100 User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.14; rv:52.0) Gecko/20100101 Firefox/52.0 SeaMonkey/2.49.2 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit X-Posted-By: 46.59.72.204 Subject: Re: Make sorting stable From: ajf@ajf.me (Andrea Faulds) Message-ID: Hi Nikita, Nikita Popov wrote: > I believe that it is important for us to provide a stable sorting option in > the standard library. We could introduce a separate stable function(s) for > this purpose, but I believe we can just as well make all our existing sorts > stable. I agree with having a stable sorting algorithm, because I suspect comparison functions in the wild are not exhaustive[0] enough to provide very stable orderings, so having a well-defined ordering is kinder to users — nobody wants sort() to scramble their array members if they compare equal, right? [0] By “exhaustive” I mean checking _every_ part of each value, not just e.g. $a->name. Also, you might have SELECT'd some items from a database and the DB explicitly (ORDER BY) or implicitly ordered them, but you didn't think to include the thing ordered by in the SELECT'd columns. > The only issue I ran into is that this change has a negative impact on code > using illegal comparison callbacks like this: > > usort($array, function($a, $b) { > return $a > $b; > }); > > The return value of the sorting callback should be $a <=> $b, but using $a >> $b also works right now, due to implementation details of the sorting > implementation (only the outcome of $compare($a, $b) > 0 ends up being used > by the sort). > > This kind of incorrect code will break under the proposed implementation, > because we will now compare by original position if the comparison function > reports equality. Because the comparator reports equality inconsistently > (it says that $a == $b, but $b != $a), the sort results are also > inconsistent. My personal feeling is stable sorting is more important than supporting broken callbacks, but I have no stats to base that on. I think the suggestions in other emails of either trying to “fix” broken algorithms or to warn the user they're broken are good ideas. I might lean towards the latter given that: • If such code was written before PHP 7.0 (IIRC) switched PHP to an unstable sort, the code would have been broken then and hopefully already fixed. • PHP 7.0 introduced <=> to make writing correct comparison callbacks easier. • We help the programmer to do the right thing by pointing it out to them, so they know what to do in future, and they get improved performance (I think?) for it. Thanks, Andrea