Hello all,
As per the How To Create an RFC https://wiki.php.net/rfc/howto instructions, I am sending this e-mail in order to get your feedback on my proposal.
I propose introducing a function to PHP core named array_group. This function takes an array and a function and returns an array that contains arrays - groups of consecutive elements. This is very similar to Haskell's groupBy function https://hackage.haskell.org/package/groupBy-0.1.0.0/docs/Data-List-GroupBy.html.
For some background as to why - usually, when people want to do grouping in PHP, they use hash maps, so something like:
<?php
$array = [
    [ 'id' => 1, 'value' => 'foo' ],
    [ 'id' => 1, 'value' => 'bar' ],
    [ 'id' => 2, 'value' => 'baz' ],
];
$groups = [];
foreach ( $array as $element ) {
    $groups[ $element['id'] ][] = $element;
}
var_dump( $groups );
This can now be achieved as follows (not preserving keys):
<?php
$array = [
    [ 'id' => 1, 'value' => 'foo' ],
    [ 'id' => 1, 'value' => 'bar' ],
    [ 'id' => 2, 'value' => 'baz' ],
];
$groups = array_group( $array, function( $a, $b ) {
    return $a['id'] == $b['id'];
} );
The disadvantage of the first approach is that we are limited to equality checks, and we cannot group by, say, < or other functions.
Conversely, the advantage of the first approach is that the keys are preserved, and elements needn't be consecutive.
In any case, I think a utility function such as array_group will be widely useful.
Please find attached a patch with a proposed implementation. Curious about your feedback.
Best,
Boro Sitnikovski https://people.php.net/bor0

Updated the patch: added a test for the increasing subsequences example, and a minor bugfix.
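The patch itself is not reproduced in this thread, so the following is only a rough userland sketch of what an increasing-subsequences test might exercise. The function name group_consecutive_sketch and the sample input are assumptions made for illustration; this is not the C implementation from the patch.

```php
<?php
// Userland stand-in for the proposed array_group (hypothetical name).
// A new group starts whenever $same($previous, $current) returns false.
function group_consecutive_sketch(array $items, callable $same): array
{
    $groups = [];
    $group = [];
    $prev = null;
    foreach ($items as $item) {
        if ($group !== [] && !$same($prev, $item)) {
            $groups[] = $group;
            $group = [];
        }
        $group[] = $item;
        $prev = $item;
    }
    if ($group !== []) {
        $groups[] = $group;
    }
    return $groups;
}

// Split the input into non-decreasing runs.
$runs = group_consecutive_sketch(
    [1, 2, 2, 3, 1, 2, 0, 4, 5, 2],
    function ($a, $b) {
        return $a <= $b;
    }
);
// $runs is [[1, 2, 2, 3], [1, 2], [0, 4, 5], [2]]
```

Using <= yields non-decreasing runs; swapping in < would make repeated values start a new group.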

Thank you all for the comments. I agree that there are many ways to do grouping, but based on the discussion here, I think there are two main grouping cases:
- The grouping that JavaScript/.NET/Lodash/Scala/etc. do (this should be the default of array_group)
- The grouping that Haskell does, the one I proposed earlier (this can be selected with a flag in array_group)
Based on this, I'd like to adjust my initial proposal, so that we would have the following function:
function array_group(array $array, callable $callback, bool $consecutive_pairs = false): array {}
If the argument consecutive_pairs is false, the callback's return value is used as the grouping key ($callback accepts a single element in this case).
Otherwise, the callback's boolean return value is used to check whether two consecutive elements belong in the same group ($callback accepts two elements in this case).
(This approach seems to be consistent with array_filter, in the sense that the callback accepts one or two arguments.)
With a few example usages:
$arr1 = ['one', 'two', 'three'];
var_dump( array_group( $arr1, function( $x ) {
    return (string) strlen( $x );
} ) );
// Producing ['3' => ['one', 'two'], '5' => ['three']]
Another one:
$arr = [-1,2,-3,-4,2,1,2,-3,1,1,2];
$groups = array_group( $arr, function( $p1, $p2 ) {
    return ($p1 > 0) == ($p2 > 0);
}, true ); // $consecutive_pairs = true, so the callback compares two neighbours
// Producing [[-1],[2],[-3,-4],[2,1,2],[-3],[1,1,2]]
I believe this proposal captures many use cases, beyond the examples we discussed. Curious about any other thoughts.
I'm also attaching a PoC patch that implements this.
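For readers without the PoC patch, the default (key-based) mode described above can be approximated in userland roughly as follows. The name group_by_key_sketch is hypothetical and the behaviour of the real patch may differ. One detail worth knowing: PHP normalizes numeric string keys such as '3' to integers, so in a userland version the (string) cast does not change the resulting keys.

```php
<?php
// Hypothetical userland sketch of the key-based mode: the callback's
// return value becomes the array key of the group.
function group_by_key_sketch(array $items, callable $keyOf): array
{
    $groups = [];
    foreach ($items as $item) {
        // The return value must be usable as an array key (int or string).
        $groups[$keyOf($item)][] = $item;
    }
    return $groups;
}

$byLength = group_by_key_sketch(['one', 'two', 'three'], function ($x) {
    return (string) strlen($x);
});
// $byLength is [3 => ['one', 'two'], 5 => ['three']]; the keys are ints
// because PHP casts numeric string keys like '3' to integers.
var_dump(array_keys($byLength));
```

Callback return values that are neither ints nor strings are subject to PHP's usual array-key rules, so a native function would need to decide how to handle them.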

This sounds like two separate functions in a trenchcoat. It should be two separate functions.
Side note: I don't think anyone reads patches sent to the list. I didn't even realize it allowed attachments. :-) If you want someone to review code, a GitHub PR is the way to go.
You can also include benchmarks there. From experience, if you don't have a compelling reason why this needs to be in C rather than PHP (which in this case boils down to performance exclusively), you're not going to be able to convince people to add another random utility function. That's just the reality these days.
--Larry Garfield
Thank you for your thoughts. We can definitely separate them into two functions. Also, thanks for the note about the patches - will keep it in mind :)
Good point on the benchmarks, I will keep those in mind.
Beyond performance, I think this is largely about developer experience, too. Almost all mainstream languages have a built-in grouping function, and in my honest opinion, it's time PHP had one too :) I think we can all agree grouping is a very common, basic piece of functionality that we all run into regularly.
Hello Boro,
I think you should include the "expected result" in your code examples.
Maybe this is in your patch file, but I don't think we want to look at
that for discussion.
Cheers
Andreas
Hey,
Thanks for the suggestion.
For the earlier code examples, I added the expected results to a Gist so as not to clutter the thread too much:
- The first example corresponds to https://gist.github.com/bor0/b5f449bfe85440d96abd933b9f03b310#file-test_manual_group-php
- The second example corresponds to https://gist.github.com/bor0/b5f449bfe85440d96abd933b9f03b310#file-test_array_group-php
- Another example, addressing the problem of increasing subsequences, is very simple with array_group: https://gist.github.com/bor0/b5f449bfe85440d96abd933b9f03b310#file-test_array_incr_subseqs-php
Best,
Boro
Thank you, this clarifies and it confirms my initial assumption of
what you are proposing.
So you want to slice an array by comparing adjacent values.
My personal feedback:
I think the need for the grouping behavior you describe is not common
enough that it needs its own native function.
I would say the more common desired behavior is the one in your first
example. And even for that we don't have a native function.
Your behavior can be implemented in userland like so:
https://3v4l.org/epvHm
$arr1 = array(1,2,2,3,1,2,0,4,5,2);
$groups = [];
$group = [];
$prev = NULL;
foreach ($arr1 as $value) {
    if ($group && $prev > $value) {
        $groups[] = $group;
        $group = [];
    }
    $group[] = $value;
    $prev = $value;
}
if ($group) {
    $groups[] = $group;
}
print_r($groups);
If needed, the comparison function can be separated out and passed as
a parameter.
So the array_group() function with a comparison callback parameter can
be implemented in userland.
I think you need to make a case as to why the behavior you describe
justifies a native function.
E.g. if you can find a lot of public PHP code that does this kind of grouping.
I personally suspect it is not that common.
Cheers
Andreas
Here we go,
https://3v4l.org/KsL3o
function array_group(array $arr1, callable $compare): array {
    $groups = [];
    $group = [];
    $prev = NULL;
    foreach ($arr1 as $value) {
        if ($group && !$compare($prev, $value)) {
            $groups[] = $group;
            $group = [];
        }
        $group[] = $value;
        $prev = $value;
    }
    if ($group) {
        $groups[] = $group;
    }
    return $groups;
}
Hi,
Thank you for your thoughts.
I would say the more common desired behavior is the one in your first
example. And even for that we don't have a native function.
This Google search might give more insight into the number of discussions about a grouping functionality: https://www.google.com/search?q=php+group+elements+site:stackoverflow.com
Your behavior can be implemented in userland like so:
https://3v4l.org/epvHm
Correct, but then again, we can also implement array_map/array_filter/etc. in userland :)
I think you need to make a case as to why the behavior you describe
justifies a native function.
Similar to my previous answer, but also in general - ease of access and also performance.
All of the examples I looked at are asking for the first kind of
grouping, that can be implemented as in your first example.
In all the examples, if two items are equal, they end up in the same group.
In your proposed behavior, equal items can end up in distinct groups
depending on their original position in the source array.
I don't see any questions or examples that ask for this.
-- Andreas
This is correct, although if the array is sorted first (and depending on the operation and what we want to do), we can still solve the same problem with an equality check.
The idea is that array_group is more general, since it works with operators other than ==, whereas the hash map approach is limited to equality checks.
A good illustration of this is the increasing subsequences problem, or any other problem of a similar nature.
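To make the difference concrete, here is a small sketch contrasting the two behaviours on an unsorted input, and showing that sorting first brings them back in line for an equality check. The helper name group_adjacent_sketch and the sample data are assumptions for illustration only.

```php
<?php
$items = [1, 2, 1, 2];

// Hash-map grouping: equal values always end up in the same group.
$byValue = [];
foreach ($items as $item) {
    $byValue[$item][] = $item;
}
// $byValue is [1 => [1, 1], 2 => [2, 2]]

// Consecutive grouping (hypothetical userland sketch): an element joins
// the current group only if $same(previous, current) is true.
function group_adjacent_sketch(array $items, callable $same): array
{
    $groups = [];
    $last = -1; // index of the group currently being filled
    $prev = null;
    foreach ($items as $item) {
        if ($last >= 0 && $same($prev, $item)) {
            $groups[$last][] = $item;
        } else {
            $groups[++$last] = [$item];
        }
        $prev = $item;
    }
    return $groups;
}

$equal = function ($a, $b) {
    return $a === $b;
};

$unsorted = group_adjacent_sketch($items, $equal);
// [[1], [2], [1], [2]]: equal values land in distinct groups

sort($items);
$sorted = group_adjacent_sketch($items, $equal);
// [[1, 1], [2, 2]]: same groups as the hash-map approach, minus the keys
```

Sorting changes the element order, so this equivalence only helps when the original order does not matter.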
Here are some more examples:
- Use array_group to create a list of singleton lists:
$groups = array_group( $arr, function( $p1, $p2 ) {
    return false;
} );
(This can also be achieved with array_map returning [ $x ])
- Distinct groups for consecutive positive and negative elements:
$arr = [-1,2,-3,-4,2,1,2,-3,1,1,2];
$groups = array_group( $arr, function( $p1, $p2 ) {
    return ($p1 > 0) == ($p2 > 0);
} );
This produces [[-1],[2],[-3,-4],[2,1,2],[-3],[1,1,2]], so we can easily capture the groups of highs/lows, for example.
- Group sentences (similar to explode, but still different):
$arr = "Hello, PHP. Good to see you.";
$groups = array_group( str_split( $arr ), function( $p1, $p2 ) {
    return '.' !== $p1;
} );
$groups = array_map( 'join', $groups );
Producing [ "Hello, PHP.", " Good to see you." ].
- Grouping book sections:
$book_sections = [ '1.0', '1.1', '1.2', '2.0', '2.1', '3.0', '3.1' ];
$groups = array_group( $book_sections, function( $p1, $p2 ) {
    return $p1[0] === $p2[0];
} );
Producing [ [ '1.0', '1.1', '1.2' ], [ '2.0', '2.1' ], [ '3.0', '3.1' ] ], and so on...
Basically, it's a very general utility :)
Best,
Boro
-- Andreas
Your behavior can be implemented in userland like so:
https://3v4l.org/epvHmCorrect, but then again, we can also implement
array_map
/array_filter
/etc. in userland :)I think you need to make a case as to why the behavior you describe
justifies a native function.Similar to my previous answer, but also in general - ease of access and also performance.
E.g. if you find a lot of public php code that does this kind of grouping.
I personally suspect it is not that common.
Cheers
AndreasHey,
Thanks for the suggestion.
For the previous case in the code, I added these in a Gist to not clutter here too much:
- The first example corresponds to https://gist.github.com/bor0/b5f449bfe85440d96abd933b9f03b310#file-test_manual_group-php
- The second example corresponds to https://gist.github.com/bor0/b5f449bfe85440d96abd933b9f03b310#file-test_array_group-php
- Another example, addressing the problem of increasing subsequences is very simple with
array_group
: https://gist.github.com/bor0/b5f449bfe85440d96abd933b9f03b310#file-test_array_incr_subseqs-phpBest,
Boro
Hello Boro,
I think you should include the "expected result" in your code examples.
Maybe this is in your patch file, but I don't think we want to look at
that for discussion.
Cheers
Andreas
Hi,
Thank you for your thoughts.
I would say the more common desired behavior is the one in your first
example. And even for that we don't have a native function.
This Google search might give more insight into the number of
discussions about a grouping functionality:
https://www.google.com/search?q=php+group+elements+site:stackoverflow.com
Your behavior can be implemented in userland like so:
https://3v4l.org/epvHm
Correct, but then again, we can also implement array_map/array_filter/etc. in userland :)
I think you need to make a case as to why the behavior you describe
justifies a native function.
Similar to my previous answer, but also in general - ease of access and
also performance.
Do you have benchmarks showing that implementing it in C would be notably faster? That would help the case that it should be written in C.
Also, please do not top-post.
--Larry Garfield
Sorry for the top-posting, my bad.
I ran some performance benchmarks and the improvement is about 25% - some details on the following gist: https://gist.github.com/bor0/3fa539263335fa415faa67606a469f2e?permalink_comment_id=4584327#gistcomment-4584327
I propose introducing a function to PHP core named array_group. This
function takes an array and a function and returns an array that
contains arrays - groups of consecutive elements. This is very similar
to Haskell's groupBy function
https://hackage.haskell.org/package/groupBy-0.1.0.0/docs/Data-List-GroupBy.html.
Sorry, I'm sick right now and have no clue, but how does it relate to
https://wiki.php.net/rfc/array_column_results_grouping
--
Aleksander Machniak
Kolab Groupware Developer [https://kolab.org]
Roundcube Webmail Developer [https://roundcube.net]
PGP: 19359DC1 # Blog: https://kolabian.wordpress.com
Okay, I will just share my decade-plus-old take on this problem for
anyone interested.
I find defining the desired array grouping rules is self-explanatory -
and can get arbitrarily complex when required - in this format:
columnA=>*
The format can get as complex as columnA[columnB][]=>columnC,columnA
- and more - and I bet you can decipher what you get from that without
much difficulty or documentation. It's also quite simple to compose
your own.
Here's the whole source code for the PHP function along with test coverage:
https://github.com/laravel/framework/discussions/45638
Cheers
As an answerer and curator of many [php][arrays][grouping] tagged questions
on Stack Overflow for several years, I'd like to mention that developers'
non-SQL grouping needs are much more nuanced than merely defining group
qualification and creating subarrays.
Often devs will want to sum, count (increment a counter), concatenate
another column (or columns), or otherwise mutate the encountered row in
each group.
Have you considered the extendability of your pitched function? Without
these extra considerations, devs would need to make another pass over the
new nested structure.
mickmackusa
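To make the point about aggregation concrete, here is a sketch of the kind of single-pass grouping developers often write by hand, where each row is counted, summed, and concatenated into its group as it is encountered. The 'amount' column and the shape of the $summary array are invented for this illustration; they do not come from the proposal.

```php
<?php
// Illustrative rows; the 'amount' column is made up for this example.
$rows = [
    ['id' => 1, 'value' => 'foo', 'amount' => 10],
    ['id' => 1, 'value' => 'bar', 'amount' => 5],
    ['id' => 2, 'value' => 'baz', 'amount' => 7],
];

$summary = [];
foreach ($rows as $row) {
    $id = $row['id'];

    // Initialise the aggregate the first time a group key is seen.
    if (!isset($summary[$id])) {
        $summary[$id] = ['count' => 0, 'total' => 0, 'values' => ''];
    }

    $summary[$id]['count']++;                  // count rows per group
    $summary[$id]['total'] += $row['amount'];  // sum a column per group

    // Concatenate another column, comma-separated after the first row.
    $summary[$id]['values'] .= ($summary[$id]['count'] > 1 ? ',' : '') . $row['value'];
}

var_dump($summary);
```

A function that only returns nested groups would need a second pass (for example an array_map over the groups) to reach the same result, which is the extra work described above.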
I'm sorry if one of the many replies already mentioned this, but there
was a failed RFC for array_group:
https://wiki.php.net/rfc/array_column_results_grouping. Note that I
voted against it for technical reasons (the signature was just awful,
we can do better), but I am not against the idea of adding such helper
functions in principle.
Folks did mention that RFC, but nobody mentioned the rejection reasons specifically. Thank you for the information and encouragement! I was a bit scared of this one being rejected too, but still decided to give it a shot :)
As an alternative approach this is what I would do in your situation:
I would start with a PHP implementation and gather some opinions on what this function should do and how it should work, finalising the functional specification to a point where you find broad acceptance. This would also allow others to help optimise the performance of your implementation.
Then start a PECL extension implementing the exact same thing in C code. This could start as a niche for people who really need the extra performance the extension would bring, and possibly grow into something that is a de facto standard.
PECL extensions are documented in the PHP manual, so the manual could have your array_group() function described, possibly with a reference to the original PHP implementation/polyfill.
This approach minimises dependence on internals giving a green light and maximises the possibility of working with others as a team on this. It would give you a chance to prove that the solution you provide is actually wanted by users. If that really is the case, then if it came to a vote on merging the extension into PHP itself, the vote could be different from the vote on the former RFC.
Greetings, Casper
Thank you for the suggestion, I like this approach and it's definitely much "safer" than going with an RFC for core directly.
What are your thoughts on creating a PECL extension called array_utils (selling point would be high performance array utils or something), which in the future might contain more than array_group*, and the approach would be to cherry-pick those functions that have frequent usage in codebases into core? Or would it be better to stick to a particular/more concrete extension?
I don't know. Also, I have no way of knowing if this would work. Although I mail using my php.net email address, and that could convey some authority, I really do not have that.
My contribution to PHP is limited to maintaining the PECL ssh2 extension. Most of what I do is merging stuff other people write, keeping an eye on bugs, tossing out GitHub issues when they are not relevant or lack real information, and updating documentation.
Sometimes changes to extension code are required by changes in the PHP interface. Mostly it's all minor stuff, and not much work.
If you end up at the stage where you start writing C code, let me know if you think I can be of value to you.
I really value other people's contributions in open source projects and enjoy contributing to create a better working development community. I think there can be value in every step that can be taken in developing software. But you should be very realistic in your expectations of code ending up in PHP core.
If you can enjoy the process, see the value in the steps in between, if you are prepared to learn stuff that everyone around here already seems to know but somehow is not written down anywhere except in the code of php or other extensions, it is really fun and will bring you insights that are invaluable in working with PHP.
Greetings, Casper