Newsgroups: php.internals
Path: news.php.net
Xref: news.php.net php.internals:43617
Mailing-List: contact internals-help@lists.php.net; run by ezmlm
Received-SPF: pass (pb1.pair.com: domain zend.com designates 212.25.124.163 as permitted sender)
Message-ID: <49D65752.6020207@zend.com>
Date: Fri, 03 Apr 2009 22:37:06 +0400
User-Agent: Thunderbird 2.0.0.21 (X11/20090320)
MIME-Version: 1.0
To: Matt Wilmas <php_lists@realplain.com>, Stas Malyshev <stas@zend.com>
CC: internals@lists.php.net, Lukas Kahwe Smith <mls@pooteeweet.org>, 
 =?ISO-8859-1?Q?Johannes_Schl=FCter?= <johannes@php.net>
References: <EB466381915442CFBBA6D6C9BCCB5392@pc1> <49D468CA.5050200@zend.com> <F29D36280ED8458AA3E2B3FA6BA94164@pc1>
In-Reply-To: <F29D36280ED8458AA3E2B3FA6BA94164@pc1>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: [PATCH] double to long conversion change
From: dmitry@zend.com (Dmitry Stogov)

Hi Matt,

I don't really see why we should "preserve the least significant bits" 
and I don't think we should support bitwise operations with doubles.

Stas, could you please look into this too.

Thanks. Dmitry.

Matt Wilmas wrote:
> Hi Dmitry,
> 
> ----- Original Message -----
> From: "Dmitry Stogov"
> Sent: Thursday, April 02, 2009
> 
>> Hi Matt,
>>
>> I tried to look into this issue once again, but I completely 
>> misunderstand why do we need all this magic. Why do we need conversion 
>> of positive double into negative long?
> 
> I don't really have any more information than what has been given in my 
> various earlier messages that I've referenced. :-)  But it's no problem! 
> It's probably too much to keep track of, or try to find which message I 
> said something in, I know (I have to do that myself to refresh memory 
> about some parts).  So feel free to ask for an explanation about 
> anything. :-)
> 
> OK, regarding conversion of postive double into negative long (or it 
> could be positive if it "rolls over" again above ULONG_MAX, etc...): 1) 
> for me, the original issue I noticed, is preserving the least 
> significant bits when using bitwise AND on a large number (old ref 
> again: [1]).  Although now I know the 5.2 behavior I was getting can't 
> be relied on (<= ULONG_MAX it's probably OK however), that's what I'm 
> trying to do -- make conversions consistent and reliable. And 2) 
> unsigned specifiers in sprintf() (%u, %x, etc.) rely on this conversion 
> (though it currently *won't work* in 5.3 on 64-bit non-Windows).  See 
> references in Bugs #30695 and #42868.
> 
> [1] http://marc.info/?l=php-internals&m=120799720922202&w=2
> 
> The magic (different methods...?) is needed depending on what type of 
> conversion works on a platform.  BTW, I wasn't satisfied with what I 
> ended up with for my patch (unsure about how some things would behave, 
> some guessing), so a few days ago I started to try coming up with 
> something more complete and precise depending on what works on a 
> platform.  Not done yet, and will need to add some configure checks, 
> etc. (new for me).
> 
>> I would stay with single DVAL_TO_LVAL() definition and use it in 
>> places instead of (long)Z_DVAL().
> 
> That (single DVAL_TO_LVAL()) is basically what 5.2 has now until you 
> added more definitions (from Zoe) ;-) (which behave differently [2]) for 
> 5.3 in Nov '07 for Bug #42868.
> 
> [2] http://marc.info/?l=php-internals&m=123496364812725&w=2
> 
>> #define DVAL_TO_LVAL(d, l) \
>> if ((d) > LONG_MAX) { \
>> (l) = LONG_MAX; \
>> } else if ((d) <  LONG_MIN) { \
>> (l) = LONG_MIN; \
>> } else {\
>> (l) = (long) (d); \
>> }
> 
> That's close to 5.3's new version (depending which is used for a 
> platform), and *precisely* what was committed to zend_operators.c in Sep 
> '04 (v1.195 "Resolve undefined behavior (joe at redhat)" [3]).  After 
> Bug #30695, it was reverted in Nov: v1.203 "Revert Joe's work around a 
> bug in GCC patch as it breaks too many things." [4]
> 
> [3] 
> http://cvs.php.net/viewvc.cgi/ZendEngine2/zend_operators.c?r1=1.194&r2=1.195&view=patch 
> 
> [4] 
> http://cvs.php.net/viewvc.cgi/ZendEngine2/zend_operators.c?r1=1.202&r2=1.203&view=patch 
> 
> 
> 
> - Matt
> 
> 
>> Or may be we need a second macro for conversion into unsigned long 
>> where it needed?
>>
>> #define DVAL_TO_ULONG(d, l) \
>> if ((d) > ULONG_MAX) { \
>> (l) = ULONG_MAX; \
>> } else if ((d) < 0) { \
>> (l) = 0; \
>> } else {\
>> (l) = (unsigned long) (d); \
>> }
>>
>> It also possible to add notices in case of overflow detection.
>>
>> Thanks. Dmitry.
>>
>> Matt Wilmas wrote:
>>> Hi all,
>>>
>>> Since noticing and reporting last year [1] different behavior when 
>>> casting out-of-range doubles to int after the DVAL_TO_LVAL() macro 
>>> was updated, I've wondered how to get the behavior I observed, and 
>>> thought could be relied on (that was wrong to think, since it was un- 
>>> or implementation-defined), back. And how to do so (what should be 
>>> expected?), while keeping in mind the reason for the change: 
>>> consistent behavior for tests. [2]  Except that the current code does 
>>> not give consistent results, depending on which DVAL_TO_LVAL 
>>> definition is used on a platform. [3]
>>>
>>> [1] http://marc.info/?l=php-internals&m=120799720922202&w=2
>>> [2] http://marc.info/?l=php-internals&m=123495655802226&w=2
>>> [3] http://marc.info/?l=php-internals&m=123496364812725&w=2
>>>
>>> So after I finally started to test my ideas for "consistent/reliable 
>>> overflow across platforms" a few days ago, I noticed that my 
>>> workaround technique quit working (0 instead of overflow) with 
>>> doubles over 2^63, without resorting to fmod().  That's on Windows, 
>>> but I suspect the same may happen on other systems that are limited 
>>> to 64-bit integer processing internally or something (32-bit 
>>> platforms?).  On 64-bit Linux anyway, it looks like doubles > 2^63 do 
>>> rollover as expected (128-bit "internal processing?"): 
>>> http://marc.info/?l=php-internals&m=123376495021789&w=2
>>>
>>> I wasn't sure how to rethink things after that...  But of course with 
>>> doubles, precision has been lost long before 2^63 anyway, as far as 
>>> increments of 1 (it's 1024 at 2^63).
>>>
>>> What I wound up with for now, is using 5.2's method on 64-bit 
>>> platforms, and on 32-bit, overflow behavior should be reliable up to 
>>> 2^63 on platforms that have zend_long64 type available (long long, 
>>> __int64), which I'm guessing is most (?), because of the unsigned 
>>> long involvement. Finally a fallback workaround for 32-bit platforms 
>>> without a 64-bit type.
>>>
>>> I updated a few other places in the code where only a (long) cast was 
>>> used. And sort of unrelated, but I added an 'L' conversion specifier 
>>> for zend_parse_parameters() in case it would be useful for PHP 
>>> functions that want to limit values to LONG_MAX/LONG_MIN, without 
>>> overflow, which I thought the DVAL_TO_LVAL change was *trying* to do.
>>>
>>> http://realplain.com/php/dval_to_lval.diff
>>> http://realplain.com/php/dval_to_lval_5_3.diff
>>>
>>> And here is an initial version of zend_dval_to_lval() (before 2^63 
>>> issue and thinking of zend_long64 + unsigned long), where some 
>>> configure checks would set ZEND_DVAL_TO_LVAL_USE_* as needed.
>>>
>>> http://realplain.com/php/dval_to_lval.txt
>>>
>>>
>>> Any general feedback, comments, questions, suggestions?  Hoping these 
>>> conversion issues could be sorted out for good in a "nice," logical 
>>> way. :-) Unfortunately on Windows, I'm just guessing, rather than 
>>> testing, conversion results in different environments...
>>>
>>>
>>> Thanks,
>>> Matt 
>