Newsgroups: php.internals
Path: news.php.net
Xref: news.php.net php.internals:43599
Mailing-List: contact internals-help@lists.php.net; run by ezmlm
Received-SPF: error (pb1.pair.com: domain realplain.com from 209.151.69.1 cause and error)
Message-ID: <F29D36280ED8458AA3E2B3FA6BA94164@pc1>
To: <internals@lists.php.net>,
	"Dmitry Stogov" <dmitry@zend.com>
Cc: "Lukas Kahwe Smith" <mls@pooteeweet.org>,
	=?iso-8859-1?Q?Johannes_Schl=FCter?= <johannes@php.net>
References: <EB466381915442CFBBA6D6C9BCCB5392@pc1> <49D468CA.5050200@zend.com>
Date: Thu, 2 Apr 2009 10:50:44 -0500
MIME-Version: 1.0
Content-Type: text/plain;
	format=flowed;
	charset="iso-8859-1";
	reply-type=response
Content-Transfer-Encoding: 7bit
Subject: Re: [PATCH] double to long conversion change
From: php_lists@realplain.com ("Matt Wilmas")

Hi Dmitry,

----- Original Message -----
From: "Dmitry Stogov"
Sent: Thursday, April 02, 2009

> Hi Matt,
>
> I tried to look into this issue once again, but I completely fail to 
> understand why we need all this magic. Why do we need conversion of a 
> positive double into a negative long?

I don't really have any more information than what I've given in the 
various earlier messages I've referenced. :-)  But it's no problem!  I know 
it's probably too much to keep track of, or to find which message I said 
something in (I have to do that myself to refresh my memory about some 
parts).  So feel free to ask for an explanation about anything. :-)

OK, regarding conversion of a positive double into a negative long (or it 
could be positive if it "rolls over" again above ULONG_MAX, etc.): 1) For 
me, the original issue I noticed was preserving the least significant bits 
when using bitwise AND on a large number (old ref again: [1]).  Although I 
now know the 5.2 behavior I was getting can't be relied on (<= ULONG_MAX 
it's probably OK, however), that's what I'm trying to do: make conversions 
consistent and reliable.  2) Unsigned specifiers in sprintf() (%u, %x, 
etc.) rely on this conversion (though it currently *won't work* in 5.3 on 
64-bit non-Windows).  See references in Bugs #30695 and #42868.

[1] http://marc.info/?l=php-internals&m=120799720922202&w=2
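
The wraparound described above (a large positive double becoming a negative 
long whose low bits are preserved) can be sketched roughly as follows.  This 
is only an illustration, not the code from the patch: the function name is 
made up, a fixed 32-bit type stands in for a 32-bit long, the conversion 
goes through a 64-bit integer (so it only covers doubles between -2^63 and 
2^63), and returning 0 outside that range is an assumption.

```c
#include <stdint.h>

/* Hypothetical helper, not taken from the patch: convert a double to a
 * signed 32-bit integer modulo 2^32, so that out-of-range values wrap
 * around instead of relying on an undefined double-to-int cast. */
static int32_t dval_to_int32_wrap(double d)
{
    int64_t wide;
    uint32_t low;

    /* Outside (-2^63, 2^63), or NaN: the 64-bit cast below would itself
     * be undefined, so bail out.  Returning 0 here is an assumption. */
    if (!(d > -9223372036854775808.0 && d < 9223372036854775808.0)) {
        return 0;
    }

    wide = (int64_t) d;      /* truncates toward zero; value is in range */
    low = (uint32_t) wide;   /* unsigned conversion is modulo 2^32 */

    /* Reinterpret the low 32 bits as signed without relying on an
     * implementation-defined unsigned-to-signed cast. */
    if (low > (uint32_t) INT32_MAX) {
        return (int32_t) (low - 2147483648u) + INT32_MIN;
    }
    return (int32_t) low;
}
```

With this, 4294967295.0 converts to -1, so a bitwise AND with 0xFF still 
gives 0xFF, and reading the result back as unsigned (as %u or %x would) 
gives 4294967295 again.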

The magic (different methods...?) is needed because which type of 
conversion works depends on the platform.  BTW, I wasn't satisfied with what 
I ended up with for my patch (I was unsure how some things would behave and 
did some guessing), so a few days ago I started trying to come up with 
something more complete and precise, depending on what works on a platform. 
It's not done yet, and I will need to add some configure checks, etc. (new 
for me).

> I would stay with single DVAL_TO_LVAL() definition and use it in places 
> instead of (long)Z_DVAL().

That (a single DVAL_TO_LVAL()) is basically what 5.2 has now, and what 5.3 
had until you added more definitions (from Zoe) ;-) in Nov '07 for Bug 
#42868; those behave differently [2].

[2] http://marc.info/?l=php-internals&m=123496364812725&w=2

> #define DVAL_TO_LVAL(d, l) \
> if ((d) > LONG_MAX) { \
> (l) = LONG_MAX; \
> } else if ((d) <  LONG_MIN) { \
> (l) = LONG_MIN; \
> } else {\
> (l) = (long) (d); \
> }

That's close to 5.3's new version (depending on which is used for a 
platform), and *precisely* what was committed to zend_operators.c in Sep '04 
(v1.195 "Resolve undefined behavior (joe at redhat)" [3]).  After Bug 
#30695, it was reverted in Nov: v1.203 "Revert Joe's work around a bug in 
GCC patch as it breaks too many things." [4]

[3] http://cvs.php.net/viewvc.cgi/ZendEngine2/zend_operators.c?r1=1.194&r2=1.195&view=patch
[4] http://cvs.php.net/viewvc.cgi/ZendEngine2/zend_operators.c?r1=1.202&r2=1.203&view=patch
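
To illustrate what a clamping conversion like the quoted macro does to 
values that are later treated as unsigned, here is a minimal sketch.  It is 
a model only: the function name is made up, a fixed 32-bit type stands in 
for a 32-bit long so the result doesn't depend on the platform, and the NaN 
check is an addition that the quoted macro does not have.

```c
#include <stdint.h>

/* Hypothetical model of a saturating double-to-long conversion, using a
 * fixed 32-bit type in place of long (assumption). */
static int32_t dval_to_int32_clamp(double d)
{
    if (d != d) {
        return 0;             /* NaN: mapping it to 0 is an assumption */
    }
    if (d >= 2147483648.0) {  /* above INT32_MAX: saturate */
        return INT32_MAX;
    }
    if (d < -2147483648.0) {  /* below INT32_MIN: saturate */
        return INT32_MIN;
    }
    return (int32_t) d;       /* in range: truncates toward zero */
}

/* What an unsigned specifier would see after the clamped conversion. */
static uint32_t clamp_as_unsigned(double d)
{
    return (uint32_t) dval_to_int32_clamp(d);
}
```

Here 4294967295.0 is clamped to 2147483647, so formatting the result as 
unsigned gives 2147483647 (7fffffff in hex) rather than 4294967295 
(ffffffff); every value above the signed maximum collapses to the same 
number.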


- Matt


> Or maybe we need a second macro for conversion into unsigned long where 
> it is needed?
>
> #define DVAL_TO_ULONG(d, l) \
> if ((d) > ULONG_MAX) { \
> (l) = ULONG_MAX; \
> } else if ((d) < 0) { \
> (l) = 0; \
> } else {\
> (l) = (unsigned long) (d); \
> }
>
> It is also possible to add notices in case of overflow detection.
>
> Thanks. Dmitry.
>
> Matt Wilmas wrote:
>> Hi all,
>>
>> Since noticing and reporting last year [1] different behavior when 
>> casting out-of-range doubles to int after the DVAL_TO_LVAL() macro was 
>> updated, I've wondered how to get back the behavior I had observed and 
>> thought could be relied on (that was wrong to think, since it was 
>> undefined or implementation-defined). And how to do so (what should be 
>> expected?) while keeping in mind the reason for the change: consistent 
>> behavior for tests. [2]  Except that the current code does not give 
>> consistent results; they depend on which DVAL_TO_LVAL definition is used 
>> on a platform. [3]
>>
>> [1] http://marc.info/?l=php-internals&m=120799720922202&w=2
>> [2] http://marc.info/?l=php-internals&m=123495655802226&w=2
>> [3] http://marc.info/?l=php-internals&m=123496364812725&w=2
>>
>> So after I finally started to test my ideas for "consistent/reliable 
>> overflow across platforms" a few days ago, I noticed that my workaround 
>> technique quit working (0 instead of overflow) with doubles over 2^63, 
>> without resorting to fmod().  That's on Windows, but I suspect the same 
>> may happen on other systems that are limited to 64-bit integer processing 
>> internally or something (32-bit platforms?).  On 64-bit Linux anyway, it 
>> looks like doubles > 2^63 do rollover as expected (128-bit "internal 
>> processing?"): http://marc.info/?l=php-internals&m=123376495021789&w=2
>>
>> I wasn't sure how to rethink things after that...  But of course with 
>> doubles, precision in increments of 1 has been lost long before 2^63 
>> anyway (the spacing between adjacent doubles is 1024 just below 2^63).
>>
>> What I wound up with for now is using 5.2's method on 64-bit platforms. 
>> On 32-bit, overflow behavior should be reliable up to 2^63 on platforms 
>> that have a zend_long64 type available (long long, __int64), which I'm 
>> guessing is most (?), because of the unsigned long involvement.  Finally, 
>> there is a fallback workaround for 32-bit platforms without a 64-bit type.
>>
>> I updated a few other places in the code where only a (long) cast was 
>> used. And sort of unrelated, but I added an 'L' conversion specifier for 
>> zend_parse_parameters() in case it would be useful for PHP functions that 
>> want to limit values to LONG_MAX/LONG_MIN, without overflow, which I 
>> thought the DVAL_TO_LVAL change was *trying* to do.
>>
>> http://realplain.com/php/dval_to_lval.diff
>> http://realplain.com/php/dval_to_lval_5_3.diff
>>
>> And here is an initial version of zend_dval_to_lval() (before 2^63 issue 
>> and thinking of zend_long64 + unsigned long), where some configure checks 
>> would set ZEND_DVAL_TO_LVAL_USE_* as needed.
>>
>> http://realplain.com/php/dval_to_lval.txt
>>
>>
>> Any general feedback, comments, questions, suggestions?  Hoping these 
>> conversion issues could be sorted out for good in a "nice," logical way. 
>> :-) Unfortunately on Windows, I'm just guessing, rather than testing, 
>> conversion results in different environments...
>>
>>
>> Thanks,
>> Matt