Hi internals!
I'd like to change our double-to-string casting behavior to be
locale-independent and would appreciate some opinions as to whether you
consider this feasible.
So, first off, this is how PHP currently behaves:
<?php
setlocale(LC_ALL, 'de_DE');
var_dump((string) 3.14);
// string(4) "3,14"
The de_DE locale uses "," as the decimal separator (rather than ".") and
PHP makes use of this information when casting floating point numbers to
string.
That may seem like a nice feature, but practically it causes a lot of
issues: While PHP has no problem using "," when outputting floats, nothing
(including PHP itself) actually accepts that format.
E.g. if you have a floating point number and cast it to a string you will
NOT be able to cast the string back to a float, because PHP can't handle
the comma. This breaks PHP's usual paradigm of "numeric strings should
behave the same way as floats/ints".
<?php
setlocale(LC_ALL, 'de_DE'); // same comma-locale as above
$float = 3.14;
$string = (string) $float;
$newFloat = (float) $string;
var_dump($newFloat);
// double(3)
// WTF???
But this issue is not specific to PHP's own (float) cast. Practically no
protocols, APIs, etc accept floating point numbers with a comma.
Some examples:
- If you create a MySQL query and put in a double value like so:
    $query = "INSERT INTO ... VALUES ($double)";
    // assume that $double is guaranteed to be a double here
  I think the assumption the vast majority of developers would have here
  is that the above code works correctly and is secure (under the assumption
  that $double really is a double and you verified that). But that's not
  true. With a comma-locale like de_DE this will output the double with a
  comma, so you'll end up with something like this:
    "INSERT INTO ... VALUES (3.14)" // normal locale and expected behavior
    "INSERT INTO ... VALUES (3,14)" // comma-locale and unexpected behavior
  Not only does a change in locale break the code, it actually completely
  changes semantics (a tuple with one floating point value becomes a tuple
  with two integer values).
- The example that brought this issue to my attention again today is that
  our own BCMath extension breaks down when you use it with floating point
  values and a comma-locale (https://bugs.php.net/bug.php?id=55160).
- Another case where things can seriously go wrong is outputting doubles
  in the generation of code (be it PHP for caching purposes or JS for the
  client). To get around the issue you usually need to introduce some very
  ugly code that changes the LC_NUMERIC locale to 'C'. E.g. this is what Twig
  uses in its code generator:
    if (false !== $locale = setlocale(LC_NUMERIC, 0)) {
        setlocale(LC_NUMERIC, 'C');
    }
    $this->raw($value);
    if (false !== $locale) {
        setlocale(LC_NUMERIC, $locale);
    }
  In this case (just like with MySQL) you will also not just emit wrong
  code, but it can end up being working code with totally different semantics
  (as "," is usually a function argument separator).
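The save/switch/restore pattern from the Twig snippet can be wrapped once instead of being repeated at every call site. What follows is only a minimal sketch: the helper name with_c_numeric_locale is hypothetical and is not part of PHP or Twig.

```php
<?php
// Hypothetical helper: run $fn with LC_NUMERIC set to 'C', then restore
// whatever numeric locale was active before the call.
function with_c_numeric_locale(callable $fn)
{
    // Passing "0" only queries the current setting without changing it.
    $previous = setlocale(LC_NUMERIC, '0');
    if ($previous !== false) {
        setlocale(LC_NUMERIC, 'C');
    }
    try {
        return $fn();
    } finally {
        // Restore the caller's locale even if $fn throws.
        if ($previous !== false) {
            setlocale(LC_NUMERIC, $previous);
        }
    }
}

// Usage: emit a float into generated JS code with "." as the separator.
$code = with_c_numeric_locale(function () {
    return 'var x = ' . sprintf('%.2f', 3.14) . ';';
});
echo $code, "\n";
```

The try/finally keeps the locale from leaking if the callback throws, but the switch still affects the whole process while the callback runs, so this stays a workaround rather than a fix.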
These are just three random examples I came up with, but I've seen this
issue many times. The insidious thing about it is that, with very high
probability, you will not notice it during development (because you
don't use locales); it will only turn up later.
So, my suggestion is to change the (string) cast to always use "." as the
decimal separator, independent of locale. The patch for this is very
simple, just need to change a few occurrences of "%.*G" to "%.*H".
I think not having the locale-dependent output won't be much of a loss for
anyone, because if you need to actually localize the output of your
numbers, it is very likely that just replacing the decimal separator is not
enough (you will at least want to have a thousands-separator as well, i.e.
you want to use number_format).
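To illustrate the difference between a machine-readable and a localized representation, here is a small sketch using number_format with explicit separators. The sample value and the choice of German-style separators are assumptions made for the example only.

```php
<?php
$value = 1234567.891;

// Machine-readable: "." as decimal separator, no thousands separator.
// The separators are passed explicitly, so the locale plays no role.
$machine = number_format($value, 3, '.', '');

// Human-readable, German style: "," for decimals, "." for thousands.
$german = number_format($value, 2, ',', '.');

echo $machine, "\n"; // 1234567.891
echo $german, "\n";  // 1.234.567,89
```

Note that the localized form differs from the machine form in more than the decimal separator, which is why swapping only that one character is rarely enough for display purposes.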
So, thoughts?
Nikita
(Sorry for the long rant)
Hi internals!
I'd like to change our double-to-string casting behavior to be
locale-independent and would appreciate some opinions as to whether you
consider this feasible.
So, my suggestion is to change the (string) cast to always use "." as the
decimal separator, independent of locale. The patch for this is very
simple, just need to change a few occurrences of "%.*G" to "%.*H".
I'd like to see float/double casts recognize the locale's decimal
separator. It's perfectly fine in Oracle DB for numbers to be
inserted/fetched with "," (or any other character) as the decimal
separator:
<?php
$c = oci_connect('hr', 'welcome', 'localhost/XE');
$s = oci_parse($c, "alter session set nls_territory = germany");
oci_execute($s);
$s = oci_parse($c, "select 123.567 as num from dual");
oci_execute($s);
$r = oci_fetch_array($s, OCI_ASSOC);
$n1 = $r['NUM']; // value as fetched
var_dump($n1);
setlocale(LC_ALL, 'de_DE'); // this has no effect on casting to float
$n2 = (float)$n1; // now cast it to a number
var_dump($n2);
?>
The output is:
string(7) "123,567"
float(123) // Ideally this would be 123,567
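For comparison, the same value can be parsed in userland without relying on the cast. This is only a sketch under the assumption that the separators are known in advance; parse_localized_float is a hypothetical helper, not a PHP function.

```php
<?php
// Hypothetical helper: convert a string such as "123,567" or "1.234,5"
// into a float by normalizing the separators before casting.
function parse_localized_float(
    string $text,
    string $decimalSep = ',',
    string $thousandsSep = '.'
): float {
    // Drop grouping characters first, then switch the decimal separator
    // to the "." that the (float) cast understands.
    $normalized = str_replace($thousandsSep, '', $text);
    $normalized = str_replace($decimalSep, '.', $normalized);
    if (!is_numeric($normalized)) {
        throw new InvalidArgumentException("Not a number: $text");
    }
    return (float) $normalized;
}

var_dump(parse_localized_float('123,567')); // float(123.567)
```

Because the separators are arguments rather than being read from the locale, the result does not depend on what setlocale was last called with.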
Chris
--
christopher.jones@oracle.com http://twitter.com/ghrd
Free PHP & Oracle book:
http://www.oracle.com/technetwork/topics/php/underground-php-oracle-manual-098250.html
I'd like to change our double-to-string casting behavior to be
locale-independent and would appreciate some opinions as to whether you
consider this feasible.
I'd like to see float/double casts recognize the locale's decimal
separator.
That's an interesting idea, and arguably one that's more in line with
what PHP has been doing.
I'd be really interested to hear from people in countries where the
decimal separator is a comma, since I don't have any experience with
this myself as an Anglophone — do you run PHP in your native locale,
and if so, would it be better to always have dots, as Nikita suggests,
or support parsing numbers with commas? (Or some combination therein.)
Adam
On 02.10.2013 20:38, Adam Harvey wrote:
I'd like to change our double-to-string casting behavior to be
locale-independent and would appreciate some opinions as to whether you
consider this feasible.
I'd like to see float/double casts recognize the locale's decimal
separator.
That's an interesting idea, and arguably one that's more in line with
what PHP has been doing.
I'd be really interested to hear from people in countries where the
decimal separator is a comma, since I don't have any experience with
this myself as an Anglophone — do you run PHP in your native locale,
and if so, would it be better to always have dots, as Nikita suggests,
or support parsing numbers with commas? (Or some combination therein.)
+1
This is an issue I have often run into.
In my opinion, casting a value from/to a string should use the standard
machine-readable format and not a localized one. To produce a localized
format we have the function "number_format" and, since PHP 5.3, the
class "NumberFormatter".
Additionally, "setlocale" operates on the whole process, which causes
issues in multi-threaded environments. So temporarily resetting the
locale isn't safe either.
My two cents from Germany
Marc
On Wed, Oct 2, 2013 at 7:57 PM, Christopher Jones <
christopher.jones@oracle.com> wrote:
I'd like to see float/double casts recognize the locale's decimal
separator. It's perfectly fine in Oracle DB for numbers to be
inserted/fetched with "," (or any other character) as the decimal
separator:
That will work fine for the specific case of doing a (float) cast, but it
will not solve the problem in general. Oracle specifically may not have a
problem with ","-numbers, but practically everything else does :/
Nikita