Hi internals,
This was previously discussed in https://externals.io/message/108841, but I
figured it would make sense to create an RFC for this change:
https://wiki.php.net/rfc/stable_sorting
As before, the implementation approach is to stick with the existing qsort
and use a fallback comparison criterion, which is cheap to implement
internally. The obvious alternative is to use a stable base sort like
Timsort. I gave this a quick try in https://github.com/php/php-src/pull/5559,
and it does better in some cases (already sorted data) and worse in others
(random data). I don't plan to pursue this direction personally. (It may
also be interesting to use pdqsort as the unstable base, which should make
the "already sorted" case faster.)
Regards,
Nikita
Hi Nikita,
Thanks for another nice RFC.
This was previously discussed in https://externals.io/message/108841, but I
figured it would make sense to create an RFC for this change:
https://wiki.php.net/rfc/stable_sortingAs before, the implementation approach is to stick with the existing qsort
and use a fallback comparison criterion, which is cheap to implement
internally. The obvious alternative is to use a stable base sort like
Timsort. I gave this a quick try in
https://github.com/php/php-src/pull/5559,
and it does better in some cases (already sorted data) and worse in others
(random data). I don't plan to pursue this direction personally. (It may
also be interesting to use pdqsort as the unstable base, which should make
the "already sorted" case faster.)
IIRC, in PHP 5 sorting was stable. Then PHP 7 made it unstable. That was
something to account for when migrating - BC break inside.
Do you know why did this happen at the time?
On my side, I would very much prefer a new sorting flag, to be 200% sure of
the behavior:
sort($array, SORT_STABLE)
That would allow ppl that don't care about stability to let the engine go
with the fastest algorithm.
WDYT?
Nicolas
On Tue, May 12, 2020 at 4:29 PM Nicolas Grekas nicolas.grekas+php@gmail.com
wrote:
Hi Nikita,
Thanks for another nice RFC.
This was previously discussed in https://externals.io/message/108841, but
I
figured it would make sense to create an RFC for this change:
https://wiki.php.net/rfc/stable_sortingAs before, the implementation approach is to stick with the existing qsort
and use a fallback comparison criterion, which is cheap to implement
internally. The obvious alternative is to use a stable base sort like
Timsort. I gave this a quick try in
https://github.com/php/php-src/pull/5559,
and it does better in some cases (already sorted data) and worse in others
(random data). I don't plan to pursue this direction personally. (It may
also be interesting to use pdqsort as the unstable base, which should make
the "already sorted" case faster.)IIRC, in PHP 5 sorting was stable. Then PHP 7 made it unstable. That was
something to account for when migrating - BC break inside.
In PHP 5 sorting was unstable. In PHP 7 sorting became stable for arrays
with 16 elements or less. In PHP 8 sorting would become stable for all
arrays.
On my side, I would very much prefer a new sorting flag, to be 200% sure
of the behavior:
sort($array, SORT_STABLE)That would allow ppl that don't care about stability to let the engine go
with the fastest algorithm.
I'd prefer to avoid an extra option. PHP is a high-level language, and the
user should not have to concern themselves with sort stability
considerations. I'd go for a flag if stable sorting had a huge perf impact.
But as it doesn't, I don't see the point in an option.
Regards,
Nikita
Am 12.05.2020 um 16:39 schrieb Nikita Popov:
I'd prefer to avoid an extra option. PHP is a high-level language, and the
user should not have to concern themselves with sort stability
considerations.
+1
Hi internals,
This was previously discussed in https://externals.io/message/108841, but
I figured it would make sense to create an RFC for this change:
https://wiki.php.net/rfc/stable_sortingAs before, the implementation approach is to stick with the existing qsort
and use a fallback comparison criterion, which is cheap to implement
internally. The obvious alternative is to use a stable base sort like
Timsort. I gave this a quick try in
https://github.com/php/php-src/pull/5559, and it does better in some
cases (already sorted data) and worse in others (random data). I don't plan
to pursue this direction personally. (It may also be interesting to use
pdqsort as the unstable base, which should make the "already sorted" case
faster.)
If nothing new comes up, I plan to open voting on this RFC soon.
Nikita