Message-ID: <079c7f2e-d992-4934-babb-39c21d5d4534@scriptfusion.com>
Date: Mon, 31 Mar 2025 23:14:48 +0100
List-Id: internals.lists.php.net
Subject: Re: [PHP-DEV] [RFC brainstorm] Approximately equals operator
To: internals@lists.php.net
In-Reply-To: <4a3c6ce7-102d-4cfe-a7a8-35630715b870@gmail.com>
From: bilge@scriptfusion.com (Bilge)

On 31/03/2025 23:03, Niels Dossche wrote:
> Hi internals!
>
> I'm excited to share what I've been working on!
> I had an epiphany. I realized what we truly need to revolutionize PHP: a new operator.
>
> Hear me out.
> We live in an imperfect world, and we often approximate data, but neither `==` nor `===` is an ideal comparison operator to deal with these kinds of data.
>
> Introducing: the "approximately equal" (or "approx-equal") operator `~=` (to imitate the maths symbol ≃).
> This combines the power of type coercion with approximating equality.
> Who cares if things are actually equal, close enough amirite?
>
> First of all, if `$a == $b` holds, then `$a ~= $b` obviously.
> The true power lies where the data is not exactly the same, but "close enough"!
>
> Here are some examples:
>
> We all had situations where we wanted to compare two floating point numbers and it turns out that due to the non-exact representation, seemingly-equal numbers don't match! Gone are those days because the `~=` operator nicely rounds the numbers for you before comparing them.
> This also means that the "Fundamental Theorem of Engineering" now holds!
> i.e. 2.7 ~= 3 and 3.14 ~= 3. Of course also 2.7 ~= 3.14. But this is false obviously: 2 ~= 1.
>
> Ever had trouble with users mistyping something? Say no more!
> "This is a tpyo" ~= "This is a typo". It's typo-resistant!
> However, if the strings are too different, then they're not approx-equal.
> For example: "vanilla" ~= "strawberry" gives false.
> How does this work?
> * The strings are equal if their levenshtein ratio is <= 50%, so it's adaptive to the length.
> * If the ratio is > 50%, then the shortest string comes first in the comparison, such that if we ever get a `~<` operator, then "vanilla" ~< "strawberry".
>
> There is of course a PoC implementation available at: https://github.com/php/php-src/pull/18214
> You can see more examples on GitHub in the tests, here is a copy:
> ```php
> // Number compares
> var_dump(2 ~= 1); // false
> var_dump(1.4 ~= 1); // true
> var_dump(-1.4 ~= -1); // true
> var_dump(-1.5 ~= -1.8); // true
> var_dump(random_int(1, 1) ~= 1.1); // true
>
> // Array compares (just compares the lengths)
> var_dump([1, 2, 3] ~= [2, 3, 4]); // true
> var_dump([1, 2, 3] ~= [2, 3, 4, 5]); // false
>
> // String / string compares
> var_dump("This is a tpyo" ~= "This is a typo"); // true
> var_dump("something" ~= "different"); // false
> var_dump("Wtf bro" ~= "Wtf sis"); // true
>
> // String / different type compares
> var_dump(-1.5 ~= "-1.a"); // true
> var_dump(-1.5 ~= "-1.aaaaaaa"); // false
> var_dump(NULL ~= "blablabla"); // false
> ```
>
> Note that this does not support all possible Opcache optimizations _yet_, nor does it support the JIT yet.
> However, there are no real blockers to add support for that.
>
> I look forward to hearing from you!
>
> Have a nice first day of the month ;)
> Kind regards
> Niels

For the float case it's fine (because epsilon is well defined), but I think overloading the operator for the string case is not. The hard-coded 50% distance is subjective and users may well want to configure it, which an operator cannot offer; besides, Levenshtein has very limited application. If there is any sense in doing string comparisons with this operator, I think the proposed case is not it.

The array case is also not good in my view: just comparing lengths is of no use whatsoever. What it _should_ do instead is compare irrespective of order, i.e. [1, 2, 3] ~= [3, 2, 1], similar to PHPUnit's assertEqualsCanonicalizing [1].

Cheers,
Bilge

[1]: https://github.com/sebastianbergmann/comparator/blob/d67eceae47e3956aa28ab0c6e43e5a6765f45779/src/ArrayComparator.php#L43-L46
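
As a rough userland illustration of the two points above, here is a minimal sketch. Both function names are hypothetical and are not taken from the PoC or from PHPUnit. The "ratio" is assumed to mean the Levenshtein distance divided by the length of the longer string, which may differ from what the PoC computes, and the array helper simply sorts both sides before a loose comparison, which is only an approximation of what a canonicalizing comparison does.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: approximate string equality with a caller-chosen
 * threshold, which a fixed operator cannot offer.
 * Assumption: ratio = Levenshtein distance / length of the longer string.
 */
function stringsApproxEqual(string $a, string $b, float $maxRatio = 0.5): bool
{
    $longest = max(strlen($a), strlen($b));
    if ($longest === 0) {
        return true; // two empty strings are trivially equal
    }

    return levenshtein($a, $b) / $longest <= $maxRatio;
}

/**
 * Hypothetical helper: true when both lists hold the same values,
 * irrespective of order.
 */
function arraysEqualIgnoringOrder(array $a, array $b): bool
{
    // Arrays are passed by value, so sorting does not affect the caller.
    // sort() also reindexes, so == then compares values position by position.
    sort($a);
    sort($b);

    return $a == $b;
}

var_dump(stringsApproxEqual("This is a tpyo", "This is a typo"));      // bool(true)
var_dump(stringsApproxEqual("This is a tpyo", "This is a typo", 0.1)); // bool(false)
var_dump(arraysEqualIgnoringOrder([1, 2, 3], [3, 2, 1]));              // bool(true)
var_dump(arraysEqualIgnoringOrder([1, 2, 3], [2, 3, 4]));              // bool(false)
```

The second call shows why a parameter matters: the same pair of strings stops being "approximately equal" once the caller tightens the threshold, and an operator has nowhere to put that argument.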