From: jmarble@intuitivetechnology.com (Jason Marble)
Date: Fri, 24 Oct 2025 16:51:34 -0700
Subject: Re: [PHP-DEV] RFC proposal for adding SORT_STRICT flag to array_unique()
To: Weedpacket@varteg.nz
Cc: internals@lists.php.net

Correct! Basically:
- SORT_STRING: reliable and predictable when you understand the value will be converted to a string
- SORT_NUMERIC: same but _risky_, you should be certain you're working with numbers
- SORT_REGULAR: the sort is unstable and will inevitably cause a bug that no one will understand LOL

With the proposed SORT_STRICT, we will get super fast, reliable and predictable deduplication.
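
To make the difference between the existing flags concrete, here is a small sketch. The sample values are made up for illustration, and the last part only shows the identity checks a strict mode would presumably rely on; it is not the RFC's implementation.

```php
<?php
declare(strict_types=1);

// Four values of different types that all stringify to "1".
$values = [1, '1', 1.0, true];

// Default flag (SORT_STRING): elements are compared as strings,
// so everything after the first element counts as a duplicate.
$byString = array_unique($values);
var_dump($byString); // only key 0 (int 1) remains

// SORT_NUMERIC: elements are compared as numbers, so numeric strings
// written in different notations collapse into a single entry.
$byNumber = array_unique(['10', '1e1', 10, 10.0], SORT_NUMERIC);
var_dump(count($byNumber)); // int(1)

// Under identity comparison (===) none of the original four values are equal.
// This is the distinction a strict flag would presumably preserve.
var_dump(1 === '1', 1 === 1.0, 1 === true); // bool(false) three times
```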

On Fri, Oct 24, 2025 at 3:16 PM Morgan <Weedpacket@varteg.nz> wrote:
On 2025-10-25 08:34, Jason Marble wrote:
> Hello everybody!
>
>
> The potential for a `SORT_NATURAL` flag also came to mind as another
> useful addition, but I believe `SORT_STRICT` is the more critical
> feature to discuss first.
>

I know I find array_unique generally useless due to its insistence on
stringifying everything for comparison.

```
$uniques = [];
foreach($source_array as $a) {
    if(!in_array($a, $uniques, true)) {
        $uniques[] = $a;
    }
}
```

I seem to recall part of the issue is that array_unique works by sorting its elements so that "equal" values are adjacent. I know this would be
done on O(n log(n)) vs. O(n^2) grounds, but that could be addressed at
least in part by a smarter sort criterion that sorts by type/class (in
some arbitrary order) before sorting by value. For uncomparable types
(i.e., instances of most classes) this would be by object ID, because we don't _actually_ care about ordering.
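
A rough userland sketch of the sort-by-type-then-value idea described above, assuming PHP 8.0+. The function names `strict_compare()` and `unique_by_typed_sort()` are hypothetical, and this is not how array_unique is implemented internally. Arrays and resources are left out of scope here (the sketch throws for them), and objects are treated as equal only when they are the same instance.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical comparator: order by type/class name first, then by value.
 * Values of different types never compare equal, so no type juggling occurs.
 */
function strict_compare(mixed $a, mixed $b): int
{
    // get_debug_type() yields "int", "string", ... or the class name for objects.
    $byType = strcmp(get_debug_type($a), get_debug_type($b));
    if ($byType !== 0) {
        return $byType;
    }
    return match (true) {
        // The ordering of objects is arbitrary; only identity matters.
        is_object($a) => spl_object_id($a) <=> spl_object_id($b),
        // strcmp() avoids numeric-string comparison ('10' vs '1e1').
        is_string($a) => strcmp($a, $b),
        is_array($a), is_resource($a) =>
            throw new InvalidArgumentException('arrays/resources not handled in this sketch'),
        // int, float, bool, null: same type on both sides.
        default => $a <=> $b,
    };
}

/** Hypothetical O(n log n) deduplication that keeps first occurrences and keys. */
function unique_by_typed_sort(array $values): array
{
    $keys  = array_keys($values);
    $items = array_values($values);
    $order = array_keys($items); // positions 0..n-1

    // Sort positions by value; ties are broken by original position.
    usort($order, fn(int $i, int $j): int =>
        strict_compare($items[$i], $items[$j]) ?: ($i <=> $j));

    // Equal values are now adjacent: keep the first position of each run.
    $keep = [];
    $prev = null;
    foreach ($order as $pos) {
        if ($prev === null || strict_compare($items[$prev], $items[$pos]) !== 0) {
            $keep[] = $pos;
        }
        $prev = $pos;
    }
    sort($keep); // restore the original order of the survivors

    $result = [];
    foreach ($keep as $pos) {
        $result[$keys[$pos]] = $items[$pos];
    }
    return $result;
}

$in = ['a' => 1, 'b' => '1', 'c' => 1.0, 'd' => true, 'e' => 1, 'f' => '1'];
var_dump(unique_by_typed_sort($in)); // keys a, b, c, d survive; e and f are duplicates
```

The explicit tie-break on position is what lets the first occurrence win within each run of equal values, mirroring how array_unique keeps the first of several equal elements.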