I understand that array_filter()
should cost more than for/foreach because
it is a function call that calls another function for each item, while
for/foreach is a language construct that works in a totally different way
and is optimized for this kind of job.
foreach ($items as $item) { if (expr($item)) { $result[] = $item; } }
array_filter($items, function ($item) { return expr($item); });
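As a runnable illustration of the two forms above, here is a minimal sketch. The body of expr() is not given in the message, so the even-number check below is a hypothetical stand-in, and the sample input is made up. It also shows one difference that matters when comparing the two: the loop appends and so reindexes, while array_filter() keeps the original keys.

```php
<?php
// Hypothetical predicate standing in for expr() from the message.
function expr(int $item): bool
{
    return $item % 2 === 0;
}

$items = [1, 2, 3, 4, 5, 6];

// Loop form: matches are appended, so the result is reindexed from 0.
$result = [];
foreach ($items as $item) {
    if (expr($item)) {
        $result[] = $item;
    }
}

// Function form: array_filter() preserves the keys of the input array.
$filtered = array_filter($items, function ($item) {
    return expr($item);
});

var_dump(array_keys($result));   // keys 0, 1, 2
var_dump(array_keys($filtered)); // keys 1, 3, 5

// Same values once the keys are normalised.
var_dump($result === array_values($filtered)); // bool(true)
```

So the two snippets select the same values, but a strict drop-in replacement for array_filter() would need `$result[$key] = $item` rather than `$result[] = $item`.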
But is there any possibility of creating a runtime optimizer for cases
like that, one that "understands" that array_filter()
should work like
for/foreach? JS has some kind of optimization, and functions are faster than
language constructs for cases like that: http://jsben.ch/pZmLf.
I ran some benchmarks, each over a 60-second cycle:
https://pastebin.com/zGmE4pxm - for/foreach -> 17.06 million cycles
https://pastebin.com/5E2VwHYm - array (prepared function) -> 5.49 million cycles (-67.82%)
https://pastebin.com/jSA9ZBqt - array (prepared var function) -> 3.30 million cycles (-80.66%)
https://pastebin.com/YPdCmphJ - array_filter (inline) -> 3.14 million cycles (-81.58%)
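For readers who want to reproduce this kind of measurement, a minimal sketch of a time-boxed cycle counter might look like the following. It is not the code from the pastebin links: countCycles(), the input data, the predicate, and the 0.2-second window are all assumptions chosen to keep the example short (the runs above used 60 seconds).

```php
<?php
// Hypothetical helper: count how often $candidate runs within $seconds.
function countCycles(callable $candidate, float $seconds): int
{
    $cycles = 0;
    $deadline = microtime(true) + $seconds;
    while (microtime(true) < $deadline) {
        $candidate();
        $cycles++;
    }
    return $cycles;
}

$items = range(1, 100); // assumed sample data

// Candidate 1: foreach with the condition written inline.
$loop = function () use ($items) {
    $result = [];
    foreach ($items as $item) {
        if ($item % 2 === 0) {
            $result[] = $item;
        }
    }
    return $result;
};

// Candidate 2: array_filter() with a closure as the predicate.
$filter = function () use ($items) {
    return array_filter($items, function ($item) {
        return $item % 2 === 0;
    });
};

$seconds = 0.2; // assumption; use 60 to mirror the runs above
$loopCycles = countCycles($loop, $seconds);
$filterCycles = countCycles($filter, $seconds);

// Relative difference against the foreach baseline.
$delta = ($filterCycles - $loopCycles) / $loopCycles * 100;
printf("foreach: %d cycles\narray_filter: %d cycles (%+.2f%%)\n",
    $loopCycles, $filterCycles, $delta);
```

Both candidates pay the same overhead for the wrapper call and the microtime() check on every cycle, so the absolute counts are lower than a bare loop would give; only the relative difference is meaningful, and it will vary by PHP version and machine.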
I think the cost of array_filter()
itself will stay the same, since it is a
function and I believe it is already optimized, but the callable could
be optimized to "not be a function internally", working like a language construct.
And maybe this could be applied to other functions like array_map().
--
David Rodrigues
> I understand that array_filter() should cost more than for/foreach because
> it is a function call that calls another function for each item, while
> for/foreach is a language construct that works in a totally different way
> and is optimized for this kind of job.
What you're describing is called inlining. At the moment, the main
barrier to inlining is the symbol scopes.
- Any variables used within the closure must not collide with the
  calling scope unless:
  1.a: The value is captured by ref, or
  1.b: The value has been captured by val and is reset between invocations.
- Variables within the closure must destruct at the end of each
  invocation (loop).
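To make the first constraint concrete, here is a small sketch of what goes wrong if a closure body is pasted into the calling scope without renaming. The function names withClosure() and naivelyInlined() and the variable $limit are hypothetical, and the "inlining" is done by hand; this is not how the engine or any existing optimizer works.

```php
<?php
// Reference behaviour: the closure has its own scope.
function withClosure(array $items): array
{
    $limit = 10; // variable in the calling scope
    $kept = array_filter($items, function ($item) {
        $limit = 2; // closure-local, unrelated to the caller's $limit
        return $item > $limit;
    });
    return [$kept, $limit];
}

// Hand-inlined version with no renaming: the closure's local variable
// now shares a name and a scope with the caller's variable.
function naivelyInlined(array $items): array
{
    $limit = 10;
    $kept = [];
    foreach ($items as $key => $item) {
        $limit = 2; // overwrites the caller's $limit
        if ($item > $limit) {
            $kept[$key] = $item;
        }
    }
    return [$kept, $limit];
}

[$keptA, $limitA] = withClosure([1, 2, 3, 4]);
[$keptB, $limitB] = naivelyInlined([1, 2, 3, 4]);

var_dump($keptA === $keptB); // bool(true): same filtered result
var_dump($limitA);           // int(10): caller's variable untouched
var_dump($limitB);           // int(2): caller's variable clobbered
```

The filtered output is identical, but the calling scope is observably different, which is why an inliner would have to rename or otherwise isolate the closure's variables. The second constraint (destructing closure variables after each iteration) is not shown here.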
If you're seriously interested, I'd recommend prototyping an
optimization using the zend_ast_process hook.
-Sara
Hi,
> I understand that array_filter() should cost more than for/foreach because
> it is a function call that calls another function for each item, while
> for/foreach is a language construct that works in a totally different way
> and is optimized for this kind of job.
The main cost there is that array_filter has to create a copy of the
array. An alternative is using a FilterIterator, a generator, or similar,
which would make this a pipeline step. (With iterators there are
multiple function calls, but the function descriptor is cached,
making them relatively cheap.) I'd focus on improving those.
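A minimal sketch of the two lazy alternatives mentioned here, under some assumptions: it uses SPL's CallbackFilterIterator (a concrete subclass of FilterIterator) rather than a hand-written FilterIterator, and filterLazily() is a hypothetical helper name. Neither variant builds an intermediate array until the caller asks for one.

```php
<?php
$items = [1, 2, 3, 4, 5, 6];
$isEven = function ($item) {
    return $item % 2 === 0;
};

// Iterator variant: the callback runs as each element is pulled.
$viaIterator = new CallbackFilterIterator(new ArrayIterator($items), $isEven);

// Generator variant: yields matching entries one at a time, keeping keys.
function filterLazily(iterable $source, callable $predicate): Generator
{
    foreach ($source as $key => $value) {
        if ($predicate($value)) {
            yield $key => $value;
        }
    }
}

// Materialise both only to compare them; a real pipeline would keep
// iterating instead of building arrays.
$fromIterator = iterator_to_array($viaIterator);
$fromGenerator = iterator_to_array(filterLazily($items, $isEven));

var_dump($fromIterator === $fromGenerator);            // bool(true)
var_dump($fromIterator === array_filter($items, $isEven)); // bool(true)
```

Whether either variant is faster than array_filter() for a given workload is a separate question that would need measuring; the point is only that filtering becomes a step in a pipeline.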
johannes
On Fri, Sep 8, 2017 at 9:06 AM, Johannes Schlüter
johannes@schlueters.de wrote:
> Hi,
>> I understand that array_filter() should cost more than for/foreach because
>> it is a function call that calls another function for each item, while
>> for/foreach is a language construct that works in a totally different way
>> and is optimized for this kind of job.
> The main cost there is that array_filter has to create a copy of the
> array. An alternative is using a FilterIterator, a generator, or similar,
> which would make this a pipeline step. (With iterators there are
> multiple function calls, but the function descriptor is cached,
> making them relatively cheap.) I'd focus on improving those.
As far as I can tell from the source, array_filter does not create a
copy of the array. It initializes a new array and then appends to it.
If we want to improve the performance of array_filter, we need to
examine the output of common compilers to see whether they hoist the
have_callback check out of the loop; I suspect they may not, given the
complexity of ZEND_HASH_FOREACH_KEY_VAL_IND.
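The shape of that loop, and what hoisting the check would mean, can be modelled in userland PHP. This is only an illustrative sketch, not the C implementation: filterModel() and filterModelHoisted() are hypothetical names, and $haveCallback stands in for the have_callback flag mentioned above.

```php
<?php
// Model of the described behaviour: start from a new empty array and add
// the kept entries, testing the loop-invariant flag on every iteration.
function filterModel(array $input, ?callable $callback = null): array
{
    $output = [];
    $haveCallback = $callback !== null;
    foreach ($input as $key => $value) {
        if ($haveCallback) {
            if (!$callback($value)) {
                continue;
            }
        } elseif (!$value) {
            continue; // no callback: drop falsy values
        }
        $output[$key] = $value;
    }
    return $output;
}

// Same logic with the flag check hoisted out: one specialised loop per case.
function filterModelHoisted(array $input, ?callable $callback = null): array
{
    $output = [];
    if ($callback !== null) {
        foreach ($input as $key => $value) {
            if ($callback($value)) {
                $output[$key] = $value;
            }
        }
    } else {
        foreach ($input as $key => $value) {
            if ($value) {
                $output[$key] = $value;
            }
        }
    }
    return $output;
}

$input = [0, 1, '', 'a', null, 2];
$isString = function ($value) {
    return is_string($value);
};

var_dump(filterModel($input) === array_filter($input));               // bool(true)
var_dump(filterModelHoisted($input, $isString) === array_filter($input, $isString)); // bool(true)
```

In C the compiler may or may not perform this transformation on its own, which is the open question raised above; the PHP model only shows what the two loop shapes look like.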
On Fri, Sep 8, 2017 at 8:06 AM, Johannes Schlüter
johannes@schlueters.de wrote:
>> I understand that array_filter() should cost more than for/foreach because
>> it is a function call that calls another function for each item, while
>> for/foreach is a language construct that works in a totally different way
>> and is optimized for this kind of job.
> The main cost there is that array_filter has to create a copy of the
> array. An alternative is using a FilterIterator, a generator, or similar,
> which would make this a pipeline step. (With iterators there are
> multiple function calls, but the function descriptor is cached,
> making them relatively cheap.) I'd focus on improving those.
cough
https://github.com/phplang/generator/blob/master/src/iterable.php#L64
cough
-Sara