Hello There,
I would like to propose a third argument to implode(), named
skip_empty, that will cause empty elements to be ignored when
generating the implode string. By empty I mean everything that
converts to an empty string such as '', false, null, etc.
For example:
<?php
$a = array(1, '', 2, null, 3, 0, 4, true, 5, false, false);
echo implode(',', $a);       // old behavior
echo "\n";
echo implode(',', $a, true); // new feature: skip_empty = true
echo "\n";
?>
Will output:
1,,2,,3,0,4,1,5,,
1,2,3,0,4,1,5
Obviously the new parameter defaults to 'false' for backwards compatibility.
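For readers who want to try the proposed semantics without the patch, here is a minimal userland sketch. The helper name implode_skip_empty() is hypothetical; it is not part of PHP or of the attached patch. It drops every element whose string conversion is empty and then calls the existing two-argument implode().

```php
<?php
// Hypothetical userland helper (the name is an assumption, not a PHP built-in).
// Drops every element that converts to an empty string, then joins the rest.
function implode_skip_empty(string $glue, array $pieces): string
{
    $kept = array_filter($pieces, static function ($piece): bool {
        return (string) $piece !== '';   // keeps 0 and true, drops '', null, false
    });
    return implode($glue, $kept);
}

$a = array(1, '', 2, null, 3, 0, 4, true, 5, false, false);
echo implode(',', $a), "\n";            // 1,,2,,3,0,4,1,5,,
echo implode_skip_empty(',', $a), "\n"; // 1,2,3,0,4,1,5
```

This only mirrors the behavior described above; the C patch may differ in details.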
The patch file is attached for evaluation, review, feedback etc.
Please keep in mind that this is my first time playing with PHP source
so expect flaws in my code.
I did a few simple tests (using time; if someone has a non-buggy
valgrind for OS X, please let me know) against 5.3 CVS (named php53 in
examples) and my patched version (named php+ in examples). The PHP
scripts are right below the tests. Looking at the results I would say
there is almost no performance loss.
Before I place a feature request at PHP.net I would love to hear some
feedback here, or even a yes/no for this feature.
Best Regards,
Igor Feghali.
=== TEST 1 ===
$ time ./php53 /tmp/more_small.php
real 0m0.257s
user 0m0.223s
sys 0m0.032s
$ time ./php53 /tmp/more_small.php
real 0m0.258s
user 0m0.224s
sys 0m0.032s
$ time ./php53 /tmp/more_small.php
real 0m0.260s
user 0m0.225s
sys 0m0.032s
$ time ./php+ /tmp/more_small.php
real 0m0.261s
user 0m0.226s
sys 0m0.033s
$ time ./php+ /tmp/more_small.php
real 0m0.258s
user 0m0.224s
sys 0m0.032s
$ time ./php+ /tmp/more_small.php
real 0m0.260s
user 0m0.225s
sys 0m0.033s
=== TEST 2 ===
$ time ./php53 /tmp/less_big.php
real 0m0.328s
user 0m0.205s
sys 0m0.120s
$ time ./php53 /tmp/less_big.php
real 0m0.328s
user 0m0.206s
sys 0m0.120s
$ time ./php53 /tmp/less_big.php
real 0m0.326s
user 0m0.205s
sys 0m0.119s
$ time ./php+ /tmp/less_big.php
real 0m0.330s
user 0m0.206s
sys 0m0.122s
$ time ./php+ /tmp/less_big.php
real 0m0.330s
user 0m0.206s
sys 0m0.121s
$ time ./php+ /tmp/less_big.php
real 0m0.333s
user 0m0.207s
sys 0m0.124s
=== TEST 3 ===
$ time ./php+ /tmp/lots_null.php
real 0m0.263s
user 0m0.229s
sys 0m0.032s
$ time ./php+ /tmp/lots_null.php
real 0m0.261s
user 0m0.227s
sys 0m0.032s
$ time ./php+ /tmp/lots_null.php
real 0m0.260s
user 0m0.226s
sys 0m0.032s
$ time ./php+ /tmp/no_null.php
real 0m0.259s
user 0m0.226s
sys 0m0.032s
$ time ./php+ /tmp/no_null.php
real 0m0.257s
user 0m0.223s
sys 0m0.031s
$ time ./php+ /tmp/no_null.php
real 0m0.264s
user 0m0.229s
sys 0m0.032s
=== SCRIPTS ===
less_big.php
<?php
$str = str_repeat(str_repeat('foo', 1000).',', 11345);
$arr = explode(',', $str);
$out = implode(',', $arr);
lots_null.php
<?php
$str = str_repeat('foo,,', 99999);
$arr = explode(',', $str);
$out = implode(',', $arr);
more_small.php
<?php
$str = str_repeat('foo,bar,', 99999);
$arr = explode(',', $str);
$out = implode(',', $arr);
no_null.php
<?php
$str = str_repeat('foo,,', 99999);
$arr = explode(',', $str);
$out = implode(',', $arr, true);
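The runs above time the whole PHP process, so startup cost is included in every figure. As a complement, here is a rough sketch that times only the calls of interest from inside one script. The helper best_time() is hypothetical, and since the patched implode() is not available in a stock build, the sketch compares plain implode() against array_filter() followed by implode(). It assumes PHP 7.3 or later for hrtime().

```php
<?php
// Hypothetical helper: best wall-clock time in seconds over $runs calls of $fn.
function best_time(callable $fn, int $runs = 5): float
{
    $best = INF;
    for ($i = 0; $i < $runs; $i++) {
        $start = hrtime(true);                       // monotonic clock, nanoseconds
        $fn();
        $best = min($best, (hrtime(true) - $start) / 1e9);
    }
    return $best;
}

// Same input shape as lots_null.php: every second element is empty.
$arr = explode(',', str_repeat('foo,,', 99999));

$plain = best_time(function () use ($arr) {
    return implode(',', $arr);
});
$filtered = best_time(function () use ($arr) {
    return implode(',', array_filter($arr, 'strlen'));
});

printf("implode only:           %.6f s\n", $plain);
printf("array_filter + implode: %.6f s\n", $filtered);
```

Taking the best of several runs reduces noise from the rest of the system; the absolute numbers will vary from machine to machine.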
Dear Igor,
that's a really great feature!
A question: doesn't the new one perform better?
Thanks,
(c) Kenan Sulayman
Freelance Designer and Programmer
Life's Live Poetry
http://MyJurnal.tk/
Hello Kenan,
Thank you for your feedback.
A question: doesn't the new one perform better?
According to my tests, I would say they perform equally. Please let
me know if you did any deeper tests.
PS: in string.c, where one reads:
implstr.len--;
should be:
implstr.len -= Z_STRLEN_P(delim);
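To illustrate why the one-byte decrement breaks for longer delimiters, here is a small PHP sketch. It assumes the patched C code appends the delimiter after each kept element and trims the trailing one at the end; that is a guess based on the fix above, not a reading of the patch. The function join_and_trim() and its $trim parameter are hypothetical.

```php
<?php
// Illustrative only: mimics "append delimiter after each piece, then trim".
// $trim is the number of bytes removed from the end of the result.
function join_and_trim(string $delim, array $pieces, int $trim): string
{
    $out = '';
    foreach ($pieces as $piece) {
        if ((string) $piece === '') {
            continue;                    // skip_empty behavior
        }
        $out .= $piece . $delim;
    }
    if ($out === '') {
        return '';                       // nothing kept, nothing to trim
    }
    return substr($out, 0, strlen($out) - $trim);
}

$pieces = array('a', '', 'b');
$delim  = ', ';
echo join_and_trim($delim, $pieces, 1), "\n";              // a, b,  (stray comma left)
echo join_and_trim($delim, $pieces, strlen($delim)), "\n"; // a, b
```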
Best Regards,
Igor Feghali.
Dear Igor,
in local tests, here on my server, the patched version was,
every time I ran it, between 0.02 and 0.1 seconds faster than the
original.
I'd appreciate, if this feature got included into the next build.
Thanks,
(c) Kenan Sulayman
I'd appreciate it*
Please see attached a new patch that fixes the problem with delimiters
bigger than one char (that's what happens when you code after
midnight). Also, as suggested by Hannes, skip_empty is now being
parsed as zend boolean (b) instead of zval (Z).
Regards,
Igor Feghali.
Hi Igor,
thanks for the patch; I think it is an idea worth considering. Except
that the proposed solution of running array_filter() first might be
enough. Did you run into a situation where this became a bottleneck in your
code?
On Saturday, 06.12.2008, at 00:21 -0200, Igor Feghali wrote:
Hello There,
I would like to propose a third argument to implode(), named
skip_empty, that will cause empty elements to be ignored when
generating the implode string. By empty I mean everything that
converts to an empty string such as '', false, null, etc.
For consistency, if we decide to introduce that for implode(), we should
do it for explode() too.
php -r "var_dump(explode(',', 'foo,,bla'));"
array(3) {
[0]=>
string(3) "foo"
[1]=>
string(0) ""
[2]=>
string(3) "bla"
}
and
php -r "var_dump(explode(',', 'foo,,bla', true));"
array(2) {
[0]=>
string(3) "foo"
[1]=>
string(3) "bla"
}
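For comparison, the second result can already be obtained in userland today. The sketch below shows two ways, both using only existing functions: filtering the result of explode(), and preg_split() with the PREG_SPLIT_NO_EMPTY flag. Note that explode() already accepts an integer limit as its third argument, so the boolean flag in the example above should be read as illustrative syntax only.

```php
<?php
// Option 1: explode, drop empty strings, then reindex the array.
$parts    = explode(',', 'foo,,bla');
$filtered = array_values(array_filter($parts, function ($part) {
    return $part !== '';
}));
var_dump($filtered);            // two elements: "foo" and "bla"

// Option 2: let preg_split() discard the empty pieces itself.
$split = preg_split('/,/', 'foo,,bla', -1, PREG_SPLIT_NO_EMPTY);
var_dump($split === $filtered); // bool(true)
```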
Anyway, would you mind writing a proper RFC for the addition?
cu, Lars
Jabber: lars@strojny.net
Weblog: http://usrportage.de
enough. Did you run into a situation where this became a bottleneck in your
code?
Nope. I just think it's something too "simple", too "obvious" and too
"frequent", so it deserves the love of php internals. Having to call
array_filter() with a callback function just to check the emptiness
of an element is too much of a hassle and a waste of both running
and development time.
For consistency, if we decide to introduce that for implode(), we should
do it for explode() too.
Actually, that's already in my plans, but I was waiting for some feedback on
implode() first.
Anyway, would you mind writing a proper RFC for the addition?
Absolutely. I would appreciate some directions on that though. Can I
just follow the template of, say http://wiki.php.net/rfc/traits ?
Regards,
Igor Feghali.
Hi Igor,
On Saturday, 06.12.2008, at 22:38 -0200, Igor Feghali wrote:
[...]
Nope. I just think it's something too "simple", too "obvious" and too
"frequent", so it deserves the love of php internals. Having to call
array_filter() with a callback function just to check the emptiness
of an element is too much of a hassle and a waste of both running
and development time.
Alright, then I would appreciate a few real-world use cases.
[... Same for explode ...]
Actually, that's already in my plans, but I was waiting for some feedback on
implode() first.
Cool!
Anyway, would you mind writing a proper RFC for the addition?
Absolutely. I would appreciate some directions on that though. Can I
just follow the template of, say http://wiki.php.net/rfc/traits ?
I guess you meant "Absolutely not" ;-)
The procedure: register a wiki account and start using the "template"
from the traits proposal. We are pretty liberal with adding/removing
chapters, so adjust it as much as you need to make your point clear.
cu, Lars
Alright, then I would appreciate a few real-world use cases.
Ok one immediate use case I can think of:
http://pear.php.net/package/Numbers_Words
In Brazilian Portuguese we write numbers like this:
1023 => mil e vinte e tres
123 => cento e vinte e tres
23 => vinte e tres
20 => vinte
So we can have an array that maps '3' to 'tres', '2' in the tens
position to 'vinte', zero to false or an empty string, etc. We parse the
number from left to right, adding each word as an array element. In the
end we can just implode everything using the delimiter ' e '. Using the
current implode() for the number 1023 would result in 'mil e e vinte e tres'
instead of just 'mil e vinte e tres'.
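The idea above can be sketched as follows. This is a deliberately tiny illustration: the lookup tables cover only the digits used in the examples in this message, and the names $words, number_parts() and join_non_empty() are hypothetical. It is not how Numbers_Words is implemented.

```php
<?php
// Illustrative tables only: they cover just the digits used in the examples.
$words = array(
    'thousands' => array(0 => '', 1 => 'mil'),
    'hundreds'  => array(0 => '', 1 => 'cento'),
    'tens'      => array(0 => '', 2 => 'vinte'),
    'units'     => array(0 => '', 3 => 'tres'),
);

// Maps each decimal position of $n to its word, left to right.
function number_parts(int $n, array $words): array
{
    return array(
        $words['thousands'][intdiv($n, 1000) % 10],
        $words['hundreds'][intdiv($n, 100) % 10],
        $words['tens'][intdiv($n, 10) % 10],
        $words['units'][$n % 10],
    );
}

// Userland stand-in for the proposed skip_empty flag.
function join_non_empty(string $glue, array $parts): string
{
    return implode($glue, array_filter($parts, function ($part) {
        return $part !== '';
    }));
}

echo implode(' e ', number_parts(1023, $words)), "\n";        // mil e e vinte e tres
echo join_non_empty(' e ', number_parts(1023, $words)), "\n"; // mil e vinte e tres
echo join_non_empty(' e ', number_parts(123, $words)), "\n";  // cento e vinte e tres
echo join_non_empty(' e ', number_parts(20, $words)), "\n";   // vinte
```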
Using array_filter() before every implode() would add a lot of visual
overhead. I don't want to go into deeper detail, but translating
numbers to pt_BR words is a lot more complex than that. We actually have to
parse, evaluate and implode() every "chunk" of three digits first,
then add all the chunks to a new array and finally implode()
everything together, because there are different rules for inter-chunk
and intra-chunk relationships.
I am not saying the use of array_filter() is a bottleneck. But I am
sure there are a lot of heavy applications out there that would run a lot
happier with any bit of performance gain. Also, less code means happier
programmers. Example:
function myfunc($var) {
    // Keep everything that does not convert to an empty string:
    // drops '', null and false, but keeps 0 and 0.0.
    return (string) $var !== '';
}
// NOTE: leaving the callback function empty is not desired. 0 and 0.0
// should not be skipped in the implode array!
$arr = array_filter($arr, 'myfunc');
implode('foo', $arr);
against:
implode('foo', $arr, true);
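The NOTE above matters because array_filter() without a callback removes every value that is loosely equal to false, which includes 0, 0.0 and '0'. A short sketch of the difference, using only existing PHP functions (the variable names are illustrative):

```php
<?php
$arr = array(1, 0, '', 0.0, null, '0', 2);

// No callback: every "falsy" value is dropped, including the zeros.
$loose = implode(',', array_filter($arr));

// Explicit callback: only values that convert to an empty string are dropped.
$strict = implode(',', array_filter($arr, function ($value) {
    return (string) $value !== '';
}));

echo $loose, "\n";  // 1,2
echo $strict, "\n"; // 1,0,0,0,2
```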
Again this is just one single example. I had to deal with this
situation a lot of times before.
I guess you meant "Absolutely not" ;-)
Absolutely (now I think it's used correctly...)
The procedure: register a wiki account and start using the "template"
from the traits proposal. We are pretty liberal with adding/removing
chapters, so adjust it as much as you need to make your point clear.
Right.
Best Regards,
Igor Feghali.
Igor Feghali wrote:
// NOTE: leaving the callback function empty is not desired. 0 and 0.0
// should not be skipped in the implode array!
$arr = array_filter($arr, 'myfunc');
implode('foo', $arr);
against:
implode('foo', $arr, true);
implode('foo', array_filter($arr, 'myfunc'));
Simple enough and generic.
Not worth adding IMHO.
- Chris