Newsgroups: php.internals
Path: news.php.net
Xref: news.php.net php.internals:120474
Delivered-To: mailing list internals@lists.php.net
Date: Tue, 30 May 2023 19:24:51 +0200
From: buritomath@gmail.com (Boro Sitnikovski)
To: Andreas Hennings
Cc: internals@lists.php.net
Subject: Re: [PHP-DEV] [RFC] [Discussion] Add new function `array_group`
Message-ID: <9D69B8D4-835A-47C0-A3BD-64285849BB02@gmail.com>
References: <289E585B-EF8B-4B17-89BE-BE8295FD9FE1@gmail.com> <4CA1668E-A342-452E-A994-5839C377CB27@gmail.com>

Hi,

> On 30.5.2023, at 18:33,
> Andreas Hennings wrote:
>
> On Tue, 30 May 2023 at 18:27, Boro Sitnikovski wrote:
>>
>> Hi,
>>
>> Thank you for your thoughts.
>>
>>> I would say the more common desired behavior is the one in your first
>>> example. And even for that we don't have a native function.
>>
>> This Google search might give more insight into the number of discussions about a grouping functionality: https://www.google.com/search?q=php+group+elements+site:stackoverflow.com
>
> All of the examples I looked at are asking for the first kind of
> grouping, that can be implemented as in your first example.
> In all the examples, if two items are equal, they end up in the same group.
>
> In your proposed behavior, equal items can end up in distinct groups
> depending on their original position in the source array.
> I don't see any questions or examples that ask for this.

This is correct, although if the array is sorted initially (and depending on the operation and what we want to do), we can still solve the same problem by using an equality check.

The idea is that `array_group` is more general, since it works with operators other than `==`, whereas the hashmap approach is limited to equality checks.

A good illustration of this is the increasing subsequences problem, or any other problem of a similar nature.

Here are some more examples:

1. Use `array_group` to create a list of singleton lists:

```
$groups = array_group( $arr, function( $p1, $p2 ) {
  return false;
} );
```

(This can also be achieved with `array_map` returning `[ $x ]`)

2. Distinct groups for consecutive positive and negative elements

```
$arr = [-1,2,-3,-4,2,1,2,-3,1,1,2];
$groups = array_group( $arr, function( $p1, $p2 ) {
  return ($p1 > 0) == ($p2 > 0);
} );
```

This produces `[[-1],[2],[-3,-4],[2,1,2],[-3],[1,1,2]]`, so we can easily capture the groups of highs/lows, for example.

3. Group sentences (similar to `explode`, but still different)

```
$arr = "Hello, PHP. 
Good to see you.";
$groups = array_group( str_split( $arr ), function( $p1, $p2 ) {
  return '.' !== $p1;
} );
$groups = array_map( 'join', $groups );
```

Producing `[ "Hello, PHP.", " Good to see you." ]`.

4. Grouping book sections

```
$book_sections = [ '1.0', '1.1', '1.2', '2.0', '2.1', '3.0', '3.1' ];
$groups = array_group( $book_sections, function( $p1, $p2 ) {
  return $p1[0] === $p2[0];
} );
```

Producing `[ [ '1.0', '1.1', '1.2' ], [ '2.0', '2.1'], [ '3.0', '3.1' ] ]` and so on...

Basically, it's a very general utility :)

Best,

Boro

>
> -- Andreas
>
>>
>>> Your behavior can be implemented in userland like so:
>>> https://3v4l.org/epvHm
>>
>> Correct, but then again, we can also implement `array_map`/`array_filter`/etc. in userland :)
>>
>>> I think you need to make a case as to why the behavior you describe
>>> justifies a native function.
>>
>> Similar to my previous answer, but also in general - ease of access and also performance.
>>
>>> E.g. if you find a lot of public php code that does this kind of grouping.
>>>
>>> I personally suspect it is not that common.
>>>
>>> Cheers
>>> Andreas
>>>
>>>
>>> On Tue, 30 May 2023 at 17:08, Boro Sitnikovski wrote:
>>>>
>>>> Hey,
>>>>
>>>> Thanks for the suggestion.
>>>>
>>>> For the previous case in the code, I added these in a Gist to not clutter here too much:
>>>>
>>>> 1. The first example corresponds to https://gist.github.com/bor0/b5f449bfe85440d96abd933b9f03b310#file-test_manual_group-php
>>>> 2. The second example corresponds to https://gist.github.com/bor0/b5f449bfe85440d96abd933b9f03b310#file-test_array_group-php
>>>> 3.
>>>> Another example, addressing the problem of increasing subsequences is very simple with `array_group`: https://gist.github.com/bor0/b5f449bfe85440d96abd933b9f03b310#file-test_array_incr_subseqs-php
>>>>
>>>> Best,
>>>>
>>>> Boro
>>>>
>>>>> On 30.5.2023, at 16:57, Andreas Hennings wrote:
>>>>>
>>>>> Hello Boro,
>>>>> I think you should include the "expected result" in your code examples.
>>>>> Maybe this is in your patch file, but I don't think we want to look at
>>>>> that for discussion.
>>>>>
>>>>> Cheers
>>>>> Andreas
>>>>>
>>>>> On Tue, 30 May 2023 at 13:35, Boro Sitnikovski wrote:
>>>>>>
>>>>>> Hello all,
>>>>>>
>>>>>> As per the How To Create an RFC instructions, I am sending this e-mail in order to get your feedback on my proposal.
>>>>>>
>>>>>> I propose introducing a function to PHP core named `array_group`. This function takes an array and a function and returns an array that contains arrays - groups of consecutive elements. This is very similar to Haskell's `groupBy` function.
>>>>>>
>>>>>> For some background as to why - usually, when people want to do grouping in PHP, they use hash maps, so something like:
>>>>>>
>>>>>> ```
>>>>>> $array = [
>>>>>>   [ 'id' => 1, 'value' => 'foo' ],
>>>>>>   [ 'id' => 1, 'value' => 'bar' ],
>>>>>>   [ 'id' => 2, 'value' => 'baz' ],
>>>>>> ];
>>>>>>
>>>>>> $groups = [];
>>>>>> foreach ( $array as $element ) {
>>>>>>   $groups[ $element['id'] ][] = $element;
>>>>>> }
>>>>>>
>>>>>> var_dump( $groups );
>>>>>> ```
>>>>>>
>>>>>> This can now be achieved as follows (not preserving keys):
>>>>>>
>>>>>> ```
>>>>>> $array = [
>>>>>>   [ 'id' => 1, 'value' => 'foo' ],
>>>>>>   [ 'id' => 1, 'value' => 'bar' ],
>>>>>>   [ 'id' => 2, 'value' => 'baz' ],
>>>>>> ];
>>>>>>
>>>>>> $groups = array_group( $array, function( $a, $b ) {
>>>>>>   return $a['id'] == $b['id'];
>>>>>> } );
>>>>>> ```
>>>>>>
>>>>>> The disadvantage of the first approach is that we are only limited to using equality check, and we cannot group by, say, `<` or other functions.
>>>>>> Similarly, the advantage of the first approach is that the keys are preserved, and elements needn't be consecutive.
>>>>>>
>>>>>> In any case, I think a utility function such as `array_group` will be widely useful.
>>>>>>
>>>>>> Please find attached a patch with a proposed implementation. Curious about your feedback.
>>>>>>
>>>>>> Best,
>>>>>>
>>>>>> Boro Sitnikovski
>>>>>>
>>>>
>>
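
For reference, the grouping behavior that the examples in this message imply can be sketched in userland. This is only a minimal sketch and not the implementation from the attached patch: the name `array_group_sketch` is hypothetical, and it assumes the callback is called with each pair of neighboring elements (previous, current) and that a `false` result starts a new group. That assumption is inferred from the sentence-splitting and increasing-subsequences examples, which only work when adjacent elements are compared.

```php
<?php
// Hypothetical userland sketch of the proposed grouping; NOT the patch's code.
// Assumption: $same_group( $previous, $current ) is called for each pair of
// adjacent elements, and returning false starts a new group.
function array_group_sketch( array $array, callable $same_group ): array {
  $groups       = [];
  $has_previous = false;
  $previous     = null;

  foreach ( $array as $current ) {
    if ( $has_previous && $same_group( $previous, $current ) ) {
      // Same group as the previous element: append to the last group.
      $groups[ count( $groups ) - 1 ][] = $current;
    } else {
      // First element, or the callback returned false: open a new group.
      // Keys of the input array are not preserved.
      $groups[] = [ $current ];
    }
    $previous     = $current;
    $has_previous = true;
  }

  return $groups;
}

// Runs of consecutive positive / negative elements (example 2 in this message).
$signs = array_group_sketch(
  [ -1, 2, -3, -4, 2, 1, 2, -3, 1, 1, 2 ],
  function ( $p1, $p2 ) {
    return ( $p1 > 0 ) == ( $p2 > 0 );
  }
);
// $signs is [[-1],[2],[-3,-4],[2,1,2],[-3],[1,1,2]]

// Strictly increasing runs (illustrative input, not taken from the Gist).
$runs = array_group_sketch(
  [ 1, 2, 3, 2, 3, 4, 1 ],
  function ( $p1, $p2 ) {
    return $p1 < $p2;
  }
);
// $runs is [[1,2,3],[2,3,4],[1]]
```

Comparing neighbors, rather than comparing each element with the first element of its group, is what makes the `<` case produce increasing runs; a different comparison rule would give different groups for non-transitive callbacks.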