Newsgroups: php.internals
Xref: news.php.net php.internals:118982
Date: Mon, 7 Nov 2022 23:41:51 +0100
To: PHP internals
Subject: Re: [PHP-DEV] ARRAY_UNIQUE_IDENTICAL option
From: tovilo.ilija@gmail.com (Ilija Tovilo)

To avoid noise I will respond to all e-mails at once.

---

Hi someniatko

> Perhaps an alternative idea is to provide a default string value for
> enums which are not baked,

Nikolas had already brought up this idea earlier. As others have
mentioned, this opens the door for type coercion issues and only
really works for string enums. Int enums don't have an obvious string
representation.
Using the name as a string value is not intuitive (and would be really
confusing when passing int enums to string parameters), and using the
stringified backing int value would lead to collisions for most enums,
as they are most likely to be continuous sequences starting from 0.

---

Hi Rowan

> Actually, I think this is already the case for "normal" objects - I
> had no idea that array_unique used a string cast to compare objects,
> so am very surprised that it will not consider objects of completely
> different classes unique, if they happen to have the same string
> value: https://3v4l.org/UGCvB

The default string strategy is not great. Unfortunately, changing this
is likely not an option as there is no clear migration path.

> Making backed enums work with their backing value would be equally
> confusing to me - Day::MONDAY and Month::JANUARY might both be backed
> by a 1, but they are certainly distinct values. I'd much rather get
> an error that made me check the manual and find a flag than have one
> of them silently discarded.

I agree that a warning/error when comparing incompatible values would
be optimal. This is however off-topic and would require an RFC, which
I was originally hoping to avoid here.

---

Hi Levi

> In my opinion, adding another flag isn't the _real_ fix. Any function
> which does comparisons should take a callable for users to provide any
> comparison they wish. An iteratively better API would be:
>
> function array_unique(list $array, callable(T $a, T $b): int
> $comparator);

IMO the fact that the array is sorted (in some cases, not all cases)
is an implementation detail. The identity implementation proposed here
does not sort the array but uses a hashmap internally. A php_compare
function most likely isn't going to be useful outside of array_unique,
in which case it likely shouldn't be generally available, especially
if it isn't even used internally but optimized out.

> Of course, this complaining doesn't fix the situation we are in.
> My first impression is that might be better to provide one or more
> alternative functions and to deprecate `array_unique`.

The question is whether adding this option makes array_unique worse. I
don't think so. I agree that a dedicated new array_unique with saner
default values might be a good idea, but these two options are
mutually exclusive.

I'm not convinced that providing a version that accepts a closure is
necessary. Nowadays the vast majority of cases should use identity
semantics. Removing duplicates with different comparison strategies
leads to randomness. For ['1e3', 1000] both '1e3' and 1000 are valid
candidates for removal. It would make more sense to canonicalize the
array values, like converting them to integers in this case, and only
then use identity semantics.

---

Anyway, as the feedback wasn't completely unanimous I will create an
RFC for this proposal.

Ilija
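
P.S. To make the canonicalization point for ['1e3', 1000] concrete,
here is a minimal userland sketch. It assumes a hypothetical helper,
unique_identical(), which is not part of PHP and is not the proposed
implementation (that one uses a hashmap internally). The helper simply
compares with === through in_array() in strict mode, so it is
quadratic and only meant for illustration.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not a PHP built-in): keeps the first occurrence
 * of each value, comparing with === via in_array(..., true).
 * Keys of the kept elements are preserved. O(n^2), illustration only.
 */
function unique_identical(array $values): array
{
    $result = [];
    foreach ($values as $key => $value) {
        if (!in_array($value, $result, true)) {
            $result[$key] = $value;
        }
    }
    return $result;
}

$input = ['1e3', 1000];

// Under identity semantics the string and the int are distinct values,
// so nothing is removed and there is no ambiguity about which one goes.
$asIs = unique_identical($input);

// Canonicalize first (here: cast every value to int), then deduplicate
// by identity. Both elements become int 1000, so one of them is dropped.
$canonical = unique_identical(array_map(fn ($v) => (int) $v, $input));
```

With this approach the caller decides explicitly which representation
survives (the int, in this case) instead of leaving it to the
comparison strategy.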