Hi,
Resending this as missed internals at the start.
I was lately rethinking some part of the 64-bit RFC, and also seeing now
Jakub's work on catching overflows in ext/openssl and Matt Williams
suggestions on it (which was going a bit more global over it). So I came up
with these macros with two goals
- standardize the overflow checks
- do actualy checks only on architectures where it's needed
- simplify the checks where external libs (openssl, libxml, etc.) use firm
datatypes like int
#if SIZEOF_INT == SIZEOF_ZEND_LONG
define ZEND_LONG_INT_OVFL(zl) (0)
define ZEND_LONG_INT_UDFL(zl) (0)
#else
define ZEND_LONG_INT_OVFL(zlong) ((zlong) > (zend_long)INT_MAX) # define
ZEND_LONG_INT_UDFL(zlong) ((zlong) < (zend_long)INT_MIN) #endif
#define ZEND_SIZE_T_INT_OVFL(size) ((size) > (size_t)INT_MAX)
So having it like
If (ZEND_LONG_INT_OVFL(x)) {
return;
}
Compiler would eliminate the branch automatically on 32-bit and ILP64.
Some other macros to do signed/unsigned comparison, these can be extended.
#define ZEND_SIZE_T_GT_ZEND_LONG(size, zlong) ((zlong) < 0 || (size) >
(size_t)(zlong)) #define ZEND_SIZE_T_GTE_ZEND_LONG(size, zlong) ((zlong) < 0
|| (size) >= (size_t)(zlong)) #define ZEND_SIZE_T_LT_ZEND_LONG(size, zlong)
((zlong) >= 0 && (size) < (size_t)(zlong)) #define
ZEND_SIZE_T_LTE_ZEND_LONG(size, zlong) ((zlong) >= 0 && (size) <=
(size_t)(zlong))
IMHO these and maybe more are missing after the 64-bit RFC. Do you think
they would make sense? Or would make sense now, or later in master?
Thanks
Anatol
Maybe I'm missing something here, but how do these macros detect overflow
exactly? If the check is done on the actual result and not the operands
then it's not a good overflow check. Additionally, why wouldn't overflow
checks be needed on 32-bit architecture, or any other architecture for that
matter? Integers can overflow there too.
On Fri, Aug 21, 2015 at 4:41 AM, Anatol Belski anatol.php@belski.net
wrote:
Hi,
Resending this as missed internals at the start.
I was lately rethinking some part of the 64-bit RFC, and also seeing now
Jakub's work on catching overflows in ext/openssl and Matt Williams
suggestions on it (which was going a bit more global over it). So I came up
with these macros with two goals
- standardize the overflow checks
- do actualy checks only on architectures where it's needed
- simplify the checks where external libs (openssl, libxml, etc.) use firm
datatypes like int#if SIZEOF_INT == SIZEOF_ZEND_LONG
define ZEND_LONG_INT_OVFL(zl) (0)
define ZEND_LONG_INT_UDFL(zl) (0)
#else
define ZEND_LONG_INT_OVFL(zlong) ((zlong) > (zend_long)INT_MAX) # define
ZEND_LONG_INT_UDFL(zlong) ((zlong) < (zend_long)INT_MIN) #endif
#define ZEND_SIZE_T_INT_OVFL(size) ((size) > (size_t)INT_MAX)
So having it like
If (ZEND_LONG_INT_OVFL(x)) {
return;
}Compiler would eliminate the branch automatically on 32-bit and ILP64.
Some other macros to do signed/unsigned comparison, these can be extended.
#define ZEND_SIZE_T_GT_ZEND_LONG(size, zlong) ((zlong) < 0 || (size) >
(size_t)(zlong)) #define ZEND_SIZE_T_GTE_ZEND_LONG(size, zlong) ((zlong) <
0
|| (size) >= (size_t)(zlong)) #define ZEND_SIZE_T_LT_ZEND_LONG(size, zlong)
((zlong) >= 0 && (size) < (size_t)(zlong)) #define
ZEND_SIZE_T_LTE_ZEND_LONG(size, zlong) ((zlong) >= 0 && (size) <=
(size_t)(zlong))IMHO these and maybe more are missing after the 64-bit RFC. Do you think
they would make sense? Or would make sense now, or later in master?Thanks
Anatol
Hi Sherif,
-----Original Message-----
From: Sherif Ramadan [mailto:theanomaly.is@gmail.com]
Sent: Friday, August 21, 2015 11:21 AM
To: Anatol Belski anatol.php@belski.net
Cc: Dmitry Stogov dmitry@php.net; Xinchen Hui xinchen.h@zend.com;
Nikita Popov nikita.ppv@gmail.com; Pierre Joye pierre.php@gmail.com;
Bob Weinand bobwei9@hotmail.com; Jakub Zelenka bukka@php.net; Matt
Wilmas php_lists@realplain.com; PHP Internals internals@lists.php.net
Subject: Re: [PHP-DEV] Overflow checks and integral vars comparisonMaybe I'm missing something here, but how do these macros detect overflow
exactly? If the check is done on the actual result and not the operands then it's
not a good overflow check. Additionally, why wouldn't overflow checks be
needed on 32-bit architecture, or any other architecture for that matter?
Integers can overflow there too.
Example code in simplexml_load_string()
if (zend_parse_parameters(ZEND_NUM_ARGS(), "s|C!lsb", &data, &data_len, &ce, &options, &ns, &ns_len, &isprefix) == FAILURE) {
return;
}
If (ZEND_LONG_INT_OVFL(options)) {
RETURN_FALSE;
}
If (ZEND_SIZE_T_INT_OVFL(data_len)) {
RETURN_FALSE;
}
docp = xmlReadMemory(data, data_len, NULL, NULL, options);
- on x86_64 - possible int overflow without check
- on ILP64 or i386 alike - no int overflow per se, so can be ommited
Regards
Anatol
On Fri, Aug 21, 2015 at 4:41 AM, Anatol Belski anatol.php@belski.net
wrote:Hi,
Resending this as missed internals at the start.
I was lately rethinking some part of the 64-bit RFC, and also seeing
now Jakub's work on catching overflows in ext/openssl and Matt
Williams suggestions on it (which was going a bit more global over
it). So I came up with these macros with two goals
- standardize the overflow checks
- do actualy checks only on architectures where it's needed
- simplify the checks where external libs (openssl, libxml, etc.) use
firm datatypes like int#if SIZEOF_INT == SIZEOF_ZEND_LONG
define ZEND_LONG_INT_OVFL(zl) (0)
define ZEND_LONG_INT_UDFL(zl) (0)
#else
define ZEND_LONG_INT_OVFL(zlong) ((zlong) > (zend_long)INT_MAX)
define
ZEND_LONG_INT_UDFL(zlong) ((zlong) < (zend_long)INT_MIN) #endif#define ZEND_SIZE_T_INT_OVFL(size) ((size) > (size_t)INT_MAX)
So having it like
If (ZEND_LONG_INT_OVFL(x)) {
return;
}Compiler would eliminate the branch automatically on 32-bit and ILP64.
Some other macros to do signed/unsigned comparison, these can be
extended.#define ZEND_SIZE_T_GT_ZEND_LONG(size, zlong) ((zlong) < 0 || (size) >
(size_t)(zlong)) #define ZEND_SIZE_T_GTE_ZEND_LONG(size, zlong)
((zlong) <
0
|| (size) >= (size_t)(zlong)) #define ZEND_SIZE_T_LT_ZEND_LONG(size,
|| zlong)
((zlong) >= 0 && (size) < (size_t)(zlong)) #define
ZEND_SIZE_T_LTE_ZEND_LONG(size, zlong) ((zlong) >= 0 && (size) <=
(size_t)(zlong))IMHO these and maybe more are missing after the 64-bit RFC. Do you
think they would make sense? Or would make sense now, or later in master?Thanks
Anatol
--
To unsubscribe,
visit: http://www.php.net/unsub.php
-----Original Message-----
From: Anatol Belski [mailto:anatol.php@belski.net]
Sent: Friday, August 21, 2015 11:38 AM
To: 'Sherif Ramadan' theanomaly.is@gmail.com
Cc: 'Dmitry Stogov' dmitry@php.net; 'Xinchen Hui' xinchen.h@zend.com;
'Nikita Popov' nikita.ppv@gmail.com; 'Pierre Joye' pierre.php@gmail.com;
'Bob Weinand' bobwei9@hotmail.com; 'Jakub Zelenka' bukka@php.net;
'Matt Wilmas' php_lists@realplain.com; 'PHP Internals'
internals@lists.php.net
Subject: RE: [PHP-DEV] Overflow checks and integral vars comparisonHi Sherif,
-----Original Message-----
From: Sherif Ramadan [mailto:theanomaly.is@gmail.com]
Sent: Friday, August 21, 2015 11:21 AM
To: Anatol Belski anatol.php@belski.net
Cc: Dmitry Stogov dmitry@php.net; Xinchen Hui xinchen.h@zend.com;
Nikita Popov nikita.ppv@gmail.com; Pierre Joye
pierre.php@gmail.com; Bob Weinand bobwei9@hotmail.com; Jakub
Zelenka bukka@php.net; Matt Wilmas php_lists@realplain.com; PHP
Internals internals@lists.php.net
Subject: Re: [PHP-DEV] Overflow checks and integral vars comparisonMaybe I'm missing something here, but how do these macros detect
overflow exactly? If the check is done on the actual result and not
the operands then it's not a good overflow check. Additionally, why
wouldn't overflow checks be needed on 32-bit architecture, or any other
architecture for that matter?
Integers can overflow there too.Example code in
simplexml_load_string()
if (zend_parse_parameters(ZEND_NUM_ARGS(), "s|C!lsb", &data, &data_len,
&ce, &options, &ns, &ns_len, &isprefix) == FAILURE) {
return;
}If (ZEND_LONG_INT_OVFL(options)) { RETURN_FALSE; } If (ZEND_SIZE_T_INT_OVFL(data_len)) { RETURN_FALSE; } docp = xmlReadMemory(data, data_len, NULL, NULL, options);
- on x86_64 - possible int overflow without check
- on ILP64 or i386 alike - no int overflow per se, so can be ommited
No int overflow with zend_long I wanted to say, so it's "if (0)" which is eliminated, but with size_t an overflow is still possible.
On Fri, Aug 21, 2015 at 4:41 AM, Anatol Belski anatol.php@belski.net
wrote:Hi,
Resending this as missed internals at the start.
I was lately rethinking some part of the 64-bit RFC, and also seeing
now Jakub's work on catching overflows in ext/openssl and Matt
Williams suggestions on it (which was going a bit more global over
it). So I came up with these macros with two goals
- standardize the overflow checks
- do actualy checks only on architectures where it's needed
- simplify the checks where external libs (openssl, libxml, etc.)
use firm datatypes like int#if SIZEOF_INT == SIZEOF_ZEND_LONG
define ZEND_LONG_INT_OVFL(zl) (0)
define ZEND_LONG_INT_UDFL(zl) (0)
#else
define ZEND_LONG_INT_OVFL(zlong) ((zlong) > (zend_long)INT_MAX)
define
ZEND_LONG_INT_UDFL(zlong) ((zlong) < (zend_long)INT_MIN) #endif#define ZEND_SIZE_T_INT_OVFL(size) ((size) > (size_t)INT_MAX)
So having it like
If (ZEND_LONG_INT_OVFL(x)) {
return;
}Compiler would eliminate the branch automatically on 32-bit and ILP64.
Some other macros to do signed/unsigned comparison, these can be
extended.#define ZEND_SIZE_T_GT_ZEND_LONG(size, zlong) ((zlong) < 0 || (size)
(size_t)(zlong)) #define ZEND_SIZE_T_GTE_ZEND_LONG(size, zlong)
((zlong) <
0
|| (size) >= (size_t)(zlong)) #define ZEND_SIZE_T_LT_ZEND_LONG(size,
|| zlong)
((zlong) >= 0 && (size) < (size_t)(zlong)) #define
ZEND_SIZE_T_LTE_ZEND_LONG(size, zlong) ((zlong) >= 0 && (size) <=
(size_t)(zlong))IMHO these and maybe more are missing after the 64-bit RFC. Do you
think they would make sense? Or would make sense now, or later in master?Thanks
Anatol
--
To unsubscribe,
visit: http://www.php.net/unsub.php--
To unsubscribe, visit:
http://www.php.net/unsub.php
I think you're a little optimistic about how effective these macros would
be for overflow checks. Also, if we're talking ANSI C or C99, then size_t
is always unsigned, and as far as I know GCC 2.4 always treats it as such.
If we're trying to stick to C here anyway.
As far as architecture specific stuff I would much rather rely on using the
built-in GCC overflow checks here
https://gcc.gnu.org/onlinedocs/gcc/Integer-Overflow-Builtins.html
... as they are much safer and likely going to be far more performant than
doing all these casts everywhere. Not to mention the fact that you can
actually catch the overflow at the actual arithmetic level, where it's
safe, and hopefully be able to rely on the ISA's overflow or carry bits. If
we're trying to detect overflows or wraps after the fact, you don't add
much in the way of security. For example, I'm not at all sure how (zlong) <
(zend_long)INT_MIN will ever detect an overflow.
On Fri, Aug 21, 2015 at 5:40 AM, Anatol Belski anatol.php@belski.net
wrote:
-----Original Message-----
From: Anatol Belski [mailto:anatol.php@belski.net]
Sent: Friday, August 21, 2015 11:38 AM
To: 'Sherif Ramadan' theanomaly.is@gmail.com
Cc: 'Dmitry Stogov' dmitry@php.net; 'Xinchen Hui' <xinchen.h@zend.com
;
'Nikita Popov' nikita.ppv@gmail.com; 'Pierre Joye' <
pierre.php@gmail.com>;
'Bob Weinand' bobwei9@hotmail.com; 'Jakub Zelenka' bukka@php.net;
'Matt Wilmas' php_lists@realplain.com; 'PHP Internals'
internals@lists.php.net
Subject: RE: [PHP-DEV] Overflow checks and integral vars comparisonHi Sherif,
-----Original Message-----
From: Sherif Ramadan [mailto:theanomaly.is@gmail.com]
Sent: Friday, August 21, 2015 11:21 AM
To: Anatol Belski anatol.php@belski.net
Cc: Dmitry Stogov dmitry@php.net; Xinchen Hui xinchen.h@zend.com;
Nikita Popov nikita.ppv@gmail.com; Pierre Joye
pierre.php@gmail.com; Bob Weinand bobwei9@hotmail.com; Jakub
Zelenka bukka@php.net; Matt Wilmas php_lists@realplain.com; PHP
Internals internals@lists.php.net
Subject: Re: [PHP-DEV] Overflow checks and integral vars comparisonMaybe I'm missing something here, but how do these macros detect
overflow exactly? If the check is done on the actual result and not
the operands then it's not a good overflow check. Additionally, why
wouldn't overflow checks be needed on 32-bit architecture, or any other
architecture for that matter?
Integers can overflow there too.Example code in
simplexml_load_string()
if (zend_parse_parameters(ZEND_NUM_ARGS(), "s|C!lsb", &data,
&data_len,
&ce, &options, &ns, &ns_len, &isprefix) == FAILURE) {
return;
}If (ZEND_LONG_INT_OVFL(options)) { RETURN_FALSE; } If (ZEND_SIZE_T_INT_OVFL(data_len)) { RETURN_FALSE; } docp = xmlReadMemory(data, data_len, NULL, NULL, options);
- on x86_64 - possible int overflow without check
- on ILP64 or i386 alike - no int overflow per se, so can be ommited
No int overflow with zend_long I wanted to say, so it's "if (0)" which is
eliminated, but with size_t an overflow is still possible.On Fri, Aug 21, 2015 at 4:41 AM, Anatol Belski anatol.php@belski.net
wrote:Hi,
Resending this as missed internals at the start.
I was lately rethinking some part of the 64-bit RFC, and also seeing
now Jakub's work on catching overflows in ext/openssl and Matt
Williams suggestions on it (which was going a bit more global over
it). So I came up with these macros with two goals
- standardize the overflow checks
- do actualy checks only on architectures where it's needed
- simplify the checks where external libs (openssl, libxml, etc.)
use firm datatypes like int#if SIZEOF_INT == SIZEOF_ZEND_LONG
define ZEND_LONG_INT_OVFL(zl) (0)
define ZEND_LONG_INT_UDFL(zl) (0)
#else
define ZEND_LONG_INT_OVFL(zlong) ((zlong) > (zend_long)INT_MAX)
define
ZEND_LONG_INT_UDFL(zlong) ((zlong) < (zend_long)INT_MIN) #endif#define ZEND_SIZE_T_INT_OVFL(size) ((size) > (size_t)INT_MAX)
So having it like
If (ZEND_LONG_INT_OVFL(x)) {
return;
}Compiler would eliminate the branch automatically on 32-bit and
ILP64.Some other macros to do signed/unsigned comparison, these can be
extended.#define ZEND_SIZE_T_GT_ZEND_LONG(size, zlong) ((zlong) < 0 || (size)
(size_t)(zlong)) #define ZEND_SIZE_T_GTE_ZEND_LONG(size, zlong)
((zlong) <
0
|| (size) >= (size_t)(zlong)) #define ZEND_SIZE_T_LT_ZEND_LONG(size,
|| zlong)
((zlong) >= 0 && (size) < (size_t)(zlong)) #define
ZEND_SIZE_T_LTE_ZEND_LONG(size, zlong) ((zlong) >= 0 && (size) <=
(size_t)(zlong))IMHO these and maybe more are missing after the 64-bit RFC. Do you
think they would make sense? Or would make sense now, or later in
master?Thanks
Anatol
--
To unsubscribe,
visit: http://www.php.net/unsub.php--
To unsubscribe,
visit:
http://www.php.net/unsub.php
-----Original Message-----
From: Sherif Ramadan [mailto:theanomaly.is@gmail.com]
Sent: Friday, August 21, 2015 12:00 PM
To: Anatol Belski anatol.php@belski.net
Cc: Dmitry Stogov dmitry@php.net; Xinchen Hui xinchen.h@zend.com;
Nikita Popov nikita.ppv@gmail.com; Pierre Joye pierre.php@gmail.com;
Bob Weinand bobwei9@hotmail.com; Jakub Zelenka bukka@php.net; Matt
Wilmas php_lists@realplain.com; PHP Internals internals@lists.php.net
Subject: Re: [PHP-DEV] Overflow checks and integral vars comparisonI think you're a little optimistic about how effective these macros would be for
overflow checks. Also, if we're talking ANSI C or C99, then size_t is always
unsigned, and as far as I know GCC 2.4 always treats it as such.
If we're trying to stick to C here anyway.As far as architecture specific stuff I would much rather rely on using the built-in
GCC overflow checks here https://gcc.gnu.org/onlinedocs/gcc/Integer-
Overflow-Builtins.html
Yes, this is a good idea as a further extension of such mechanics. However you're talking about a different topic that I've proposed now.
... as they are much safer and likely going to be far more performant than doing
all these casts everywhere. Not to mention the fact that you can actually catch
the overflow at the actual arithmetic level, where it's safe, and hopefully be able
to rely on the ISA's overflow or carry bits. If we're trying to detect overflows or
wraps after the fact, you don't add much in the way of security. For example,
I'm not at all sure how (zlong) < (zend_long)INT_MIN will ever detect an
overflow.
What I'm talking about is detecting whether a variable of zend_long or size_t is in the safe range to be passed to a signature requiring int. This is quite a minimalistic start in this direction.
Also please remember that there is not only GCC in the world. It is quite another topic to implement overflow checks portable ways, some intrinsics can be here of some help, too. But rather than hitting quite a global thing, I'd rather start on what is simple and is for sure an issue at least with some dependency libs.
Regards
Anatol
On Fri, Aug 21, 2015 at 6:22 AM, Anatol Belski anatol.php@belski.net
wrote:
-----Original Message-----
From: Sherif Ramadan [mailto:theanomaly.is@gmail.com]
Sent: Friday, August 21, 2015 12:00 PM
To: Anatol Belski anatol.php@belski.net
Cc: Dmitry Stogov dmitry@php.net; Xinchen Hui xinchen.h@zend.com;
Nikita Popov nikita.ppv@gmail.com; Pierre Joye pierre.php@gmail.com;
Bob Weinand bobwei9@hotmail.com; Jakub Zelenka bukka@php.net; Matt
Wilmas php_lists@realplain.com; PHP Internals <internals@lists.php.netSubject: Re: [PHP-DEV] Overflow checks and integral vars comparison
I think you're a little optimistic about how effective these macros
would be for
overflow checks. Also, if we're talking ANSI C or C99, then size_t is
always
unsigned, and as far as I know GCC 2.4 always treats it as such.
If we're trying to stick to C here anyway.As far as architecture specific stuff I would much rather rely on using
the built-in
GCC overflow checks here https://gcc.gnu.org/onlinedocs/gcc/Integer-
Overflow-Builtins.htmlYes, this is a good idea as a further extension of such mechanics. However
you're talking about a different topic that I've proposed now.... as they are much safer and likely going to be far more performant
than doing
all these casts everywhere. Not to mention the fact that you can
actually catch
the overflow at the actual arithmetic level, where it's safe, and
hopefully be able
to rely on the ISA's overflow or carry bits. If we're trying to detect
overflows or
wraps after the fact, you don't add much in the way of security. For
example,
I'm not at all sure how (zlong) < (zend_long)INT_MIN will ever detect an
overflow.What I'm talking about is detecting whether a variable of zend_long or
size_t is in the safe range to be passed to a signature requiring int. This
is quite a minimalistic start in this direction.
I see. So you're not actually doing overflow checks then? Because at the
point you'd be checking this zend_long or size_t it could have already
overflowed or wrapped. The subject may have misled me to understand
differently.
Also please remember that there is not only GCC in the world. It is quite
another topic to implement overflow checks portable ways, some intrinsics
can be here of some help, too. But rather than hitting quite a global
thing, I'd rather start on what is simple and is for sure an issue at least
with some dependency libs.
Of course, but the ASM can also be ported to other architectures and
wrapped in #ifdef for non x86 and MSVC or other compilers, for example.
It's not impossible to achieve some sane degree of portability there. I see
Andrea already worked on some of this in zend_operators.h for example
https://github.com/php/php-src/blob/ee2e1691080dad2a3110107dd8bd02ee23b41fa0/Zend/zend_operators.h#L437
Of course, making integer overflow checks safe and efficient is by no means
easy on every given architecture, but surely we could aim to support at
least the broadest current architectures at first and fall back to C
overflow checks if necessary.
I agree with starting simple. I'm honestly not sure of the effectiveness of
what you're proposing to be honest, but maybe with a PR I'd be able to wrap
my head around it a little more.
Regards
Anatol
I see. So you're not actually doing overflow checks then? Because at the
point you'd be checking this zend_long or size_t it could have already
overflowed or wrapped. The subject may have misled me to understand
differently.
I think I understand the confusion: you are thinking of overflow as
something which happens within a type based on some operation
(addition, multiplication, etc).
Anatol is talking about overflows which occur when casting between
types: a value of 2^33 can safely be passed around as a 64-bit integer,
no overflow has occurred; but attempting to cast it to a 32-bit integer
will immediately overflow the 32-bit integer.
Since many PHP extensions are wrappers around libraries which may only
deal in 32-bit types, this cast is common, necessitating range checks
like the ones proposed.
Regards,
--
Rowan Collins
[IMSoP]
Hey Rowan,
Yup, I get it now. Sorry for the confusion.
I actually remember fixing a similar bug in pdo_sqlite a while back where
casting/translation between the two sizes caused such an issue. So I think
this would be pretty helpful.
On Sun, Aug 23, 2015 at 6:28 PM, Rowan Collins rowan.collins@gmail.com
wrote:
I see. So you're not actually doing overflow checks then? Because at the
point you'd be checking this zend_long or size_t it could have already
overflowed or wrapped. The subject may have misled me to understand
differently.I think I understand the confusion: you are thinking of overflow as
something which happens within a type based on some operation (addition,
multiplication, etc).Anatol is talking about overflows which occur when casting between
types: a value of 2^33 can safely be passed around as a 64-bit integer, no
overflow has occurred; but attempting to cast it to a 32-bit integer will
immediately overflow the 32-bit integer.Since many PHP extensions are wrappers around libraries which may only
deal in 32-bit types, this cast is common, necessitating range checks like
the ones proposed.Regards,
--
Rowan Collins
[IMSoP]
Hi Sherif,
-----Original Message-----
From: Sherif Ramadan [mailto:theanomaly.is@gmail.com]
Sent: Monday, August 24, 2015 4:06 PM
To: Rowan Collins rowan.collins@gmail.com
Cc: PHP Internals internals@lists.php.net
Subject: Re: [PHP-DEV] Overflow checks and integral vars comparisonHey Rowan,
Yup, I get it now. Sorry for the confusion.
I actually remember fixing a similar bug in pdo_sqlite a while back where
casting/translation between the two sizes caused such an issue. So I think this
would be pretty helpful.
Thanks for the evaluations. Now that it's pointed out another can of worms exists as well, that should be pretty much a follow up in the future.
Regards
Anatol
Hi Rowan,
-----Original Message-----
From: Rowan Collins [mailto:rowan.collins@gmail.com]
Sent: Monday, August 24, 2015 12:29 AM
To: internals@lists.php.net
Subject: Re: [PHP-DEV] Overflow checks and integral vars comparisonI see. So you're not actually doing overflow checks then? Because at
the point you'd be checking this zend_long or size_t it could have
already overflowed or wrapped. The subject may have misled me to
understand differently.I think I understand the confusion: you are thinking of overflow as something
which happens within a type based on some operation (addition,
multiplication, etc).Anatol is talking about overflows which occur when casting between
types: a value of 2^33 can safely be passed around as a 64-bit integer, no
overflow has occurred; but attempting to cast it to a 32-bit integer will
immediately overflow the 32-bit integer.Since many PHP extensions are wrappers around libraries which may only deal in
32-bit types, this cast is common, necessitating range checks like the ones
proposed.
Yep, that's the exact point. Not touching arithmetic operations as it is a complex thing. Even on 64-bit fe there are intrinsics for doing 128-bit math which could be used. ASM could be helpful ofc, however it needs much care, fe even within GCC versions. So it is big and should be another story :)
Overflows in the cast can happen when passing args, comparing variables of different type or signess, or even with the numeric constants. Hence that cast (zend_long)INT_MAX for example. At the end - it's still an overflow/underflow issue. I'm going for an RFC with this to target 7.1.
Thanks.
anatol
Hi Anatol,
I don't see any problem adding ZEND_LONG_INT_OVF and similar macros into
7.0.
Thanks. Dmitry.
On Mon, Aug 24, 2015 at 5:15 PM, Anatol Belski anatol.php@belski.net
wrote:
Hi Rowan,
-----Original Message-----
From: Rowan Collins [mailto:rowan.collins@gmail.com]
Sent: Monday, August 24, 2015 12:29 AM
To: internals@lists.php.net
Subject: Re: [PHP-DEV] Overflow checks and integral vars comparisonI see. So you're not actually doing overflow checks then? Because at
the point you'd be checking this zend_long or size_t it could have
already overflowed or wrapped. The subject may have misled me to
understand differently.I think I understand the confusion: you are thinking of overflow as
something
which happens within a type based on some operation (addition,
multiplication, etc).Anatol is talking about overflows which occur when casting between
types: a value of 2^33 can safely be passed around as a 64-bit integer,
no
overflow has occurred; but attempting to cast it to a 32-bit integer will
immediately overflow the 32-bit integer.Since many PHP extensions are wrappers around libraries which may only
deal in
32-bit types, this cast is common, necessitating range checks like the
ones
proposed.Yep, that's the exact point. Not touching arithmetic operations as it is a
complex thing. Even on 64-bit fe there are intrinsics for doing 128-bit
math which could be used. ASM could be helpful ofc, however it needs much
care, fe even within GCC versions. So it is big and should be another story
:)Overflows in the cast can happen when passing args, comparing variables of
different type or signess, or even with the numeric constants. Hence that
cast (zend_long)INT_MAX for example. At the end - it's still an
overflow/underflow issue. I'm going for an RFC with this to target 7.1.Thanks.
anatol
Hi Dmitry,
-----Original Message-----
From: Dmitry Stogov [mailto:dmitry@zend.com]
Sent: Tuesday, August 25, 2015 9:29 AM
To: Anatol Belski anatol.php@belski.net
Cc: Rowan Collins rowan.collins@gmail.com; PHP Internals
internals@lists.php.net
Subject: Re: [PHP-DEV] Overflow checks and integral vars comparisonHi Anatol,
I don't see any problem adding ZEND_LONG_INT_OVF and similar macros into
7.0.
Thanks for taking a look. I was doing a quick RFC draft yesterday also including a ZPP suggestion from Jakub. But given there's some demand on such checks still, probably better I withdraw the RFC and work towards macros for the range checks and integrating them at the appropriate places. Pure internal stuff anyway.
Thanks.
Anatol
Hi Anatol,
On Tue, Aug 25, 2015 at 12:41 PM, Anatol Belski anatol.php@belski.net
wrote:
Hi Dmitry,
-----Original Message-----
From: Dmitry Stogov [mailto:dmitry@zend.com]
Sent: Tuesday, August 25, 2015 9:29 AM
To: Anatol Belski anatol.php@belski.net
Cc: Rowan Collins rowan.collins@gmail.com; PHP Internals
internals@lists.php.net
Subject: Re: [PHP-DEV] Overflow checks and integral vars comparisonHi Anatol,
I don't see any problem adding ZEND_LONG_INT_OVF and similar macros into
7.0.Thanks for taking a look. I was doing a quick RFC draft yesterday also
including a ZPP suggestion from Jakub. But given there's some demand on
such checks still, probably better I withdraw the RFC and work towards
macros for the range checks and integrating them at the appropriate places.
Pure internal stuff anyway.
I think it makes sense to merge all macros without RFC to master for 7.0
and keep RFC just for ZPP changes as there might be some discussion about
the names (e.g. "r" is already used for resources ;) ). Or maybe just check
if everyone is ok with names (ZPP flags) and then integrate it as well. In
any case, it would be nice to have this in ZPP at some point.
Cheers
Jakub
Hi Jakub,
-----Original Message-----
From: jakub.php@gmail.com [mailto:jakub.php@gmail.com] On Behalf Of Jakub
Zelenka
Sent: Tuesday, August 25, 2015 2:46 PM
To: Anatol Belski anatol.php@belski.net
Cc: Dmitry Stogov dmitry@zend.com; Rowan Collins
rowan.collins@gmail.com; PHP Internals internals@lists.php.net
Subject: Re: [PHP-DEV] Overflow checks and integral vars comparisonHi Anatol,
On Tue, Aug 25, 2015 at 12:41 PM, Anatol Belski anatol.php@belski.net
wrote:Hi Dmitry,
-----Original Message-----
From: Dmitry Stogov [mailto:dmitry@zend.com]
Sent: Tuesday, August 25, 2015 9:29 AM
To: Anatol Belski anatol.php@belski.net
Cc: Rowan Collins rowan.collins@gmail.com; PHP Internals
internals@lists.php.net
Subject: Re: [PHP-DEV] Overflow checks and integral vars comparisonHi Anatol,
I don't see any problem adding ZEND_LONG_INT_OVF and similar macros
into 7.0.Thanks for taking a look. I was doing a quick RFC draft yesterday also
including a ZPP suggestion from Jakub. But given there's some demand
on such checks still, probably better I withdraw the RFC and work
towards macros for the range checks and integrating them at the appropriate
places.
Pure internal stuff anyway.I think it makes sense to merge all macros without RFC to master for 7.0 and
keep RFC just for ZPP changes as there might be some discussion about the
names (e.g. "r" is already used for resources ;) ). Or maybe just check if everyone
is ok with names (ZPP flags) and then integrate it as well. In any case, it would be
nice to have this in ZPP at some point.
Yep, was just a quick writing yesterday, overseen the 'r'. Actually pity as it could be like 'q', 'r,', 's' ... easy to memorize :)
With ZPP there could be also a little performance concern, as it would add four cases to parse to ZPP yet. So probably also depends on which scale it is needed. And another point is that an implementation with the fast ZPP were needed as well. So turns out as a bit more than it looked at the start, but would be IMHO useful from the functional sight.
I'd be working of the macros next days, to cover as much as possible then.
Regards
Anatol
Hi Sherif,
Sherif Ramadan wrote:
<minor off-topic note>Of course, but the ASM can also be ported to other architectures and
wrapped in #ifdef for non x86 and MSVC or other compilers, for example.
It's not impossible to achieve some sane degree of portability there. I see
Andrea already worked on some of this in zend_operators.h for example
https://github.com/php/php-src/blob/ee2e1691080dad2a3110107dd8bd02ee23b41fa0/Zend/zend_operators.h#L437Of course, making integer overflow checks safe and efficient is by no means
easy on every given architecture, but surely we could aim to support at
least the broadest current architectures at first and fall back to C
overflow checks if necessary.
The email I'm replying to here is more than a month old now.
Just to set the record straight, though, PHP's assembly and pure C
overflow checks on arithmetic operations aren't my work, they've been
there for a long time. If you were to git blame
them you might find
the original author eventually. The only overflow checks I added to PHP
were for float-to-int conversions, and that doesn't require inline asm.
One thing I did do, however, was rip out that inline assembly in my
bigint branch (which wasn't merged in the end, alas), since it was
out-of-date (didn't handle bigints) and I wouldn't dare mess with it.
Instead, I replaced it it with clang and GCC-compatible intrinsics,
which require no assembly knowledge to understand:
https://github.com/php/php-src/commit/bdcd9bbd29fac886192fbc410ccb96bd4cb93375
At some point, I might make a pull request to add this to PHP master, if
it provides a performance boost.
Thanks.
--
Andrea Faulds
http://ajf.me/
Hi Anatol,
On Fri, Aug 21, 2015 at 9:41 AM, Anatol Belski anatol.php@belski.net
wrote:
Hi,
Resending this as missed internals at the start.
I was lately rethinking some part of the 64-bit RFC, and also seeing now
Jakub's work on catching overflows in ext/openssl and Matt Williams
suggestions on it (which was going a bit more global over it). So I came up
with these macros with two goals
- standardize the overflow checks
- do actualy checks only on architectures where it's needed
- simplify the checks where external libs (openssl, libxml, etc.) use firm
datatypes like int#if SIZEOF_INT == SIZEOF_ZEND_LONG
define ZEND_LONG_INT_OVFL(zl) (0)
define ZEND_LONG_INT_UDFL(zl) (0)
#else
define ZEND_LONG_INT_OVFL(zlong) ((zlong) > (zend_long)INT_MAX) # define
ZEND_LONG_INT_UDFL(zlong) ((zlong) < (zend_long)INT_MIN) #endif
#define ZEND_SIZE_T_INT_OVFL(size) ((size) > (size_t)INT_MAX)
So having it like
If (ZEND_LONG_INT_OVFL(x)) {
return;
}
I think this makes and sense and it's a good start.
What I think would be even more useful is to have such checks in ZPP. The
thing is that usual use case for libs where the most used integer type is
int (e.g. openssl) is following:
- Get zend_long "l" resp. string "s" with size_t len from
zend_parse_paramters - check if it doesn't overflow INT_MAX or when needed INT_MIN
- cast it to int and pass the value to the lib function
If we have such functionality in ZPP, then it would be much simpler. I
would imaging adding "i" for int and something like "q" (maybe different
letter :) ) for string with int size. It would be basically the same as
"l" and "s" but there would be an extra int overflow check which would fail
if it's bigger / smaller. That would be very useful IMHO.
Cheers
Jakub
Hi Jakub,
-----Original Message-----
From: jakub.php@gmail.com [mailto:jakub.php@gmail.com] On Behalf Of Jakub
Zelenka
Sent: Sunday, August 23, 2015 8:29 PM
To: Anatol Belski anatol.php@belski.net
Cc: Dmitry Stogov dmitry@php.net; Xinchen Hui xinchen.h@zend.com;
Nikita Popov nikita.ppv@gmail.com; Pierre Joye pierre.php@gmail.com;
Bob Weinand bobwei9@hotmail.com; Matt Wilmas
php_lists@realplain.com; PHP internals list internals@lists.php.net
Subject: [PHP-DEV] Re: Overflow checks and integral vars comparisonHi Anatol,
On Fri, Aug 21, 2015 at 9:41 AM, Anatol Belski anatol.php@belski.net
wrote:Hi,
Resending this as missed internals at the start.
I was lately rethinking some part of the 64-bit RFC, and also seeing
now Jakub's work on catching overflows in ext/openssl and Matt
Williams suggestions on it (which was going a bit more global over
it). So I came up with these macros with two goals
- standardize the overflow checks
- do actualy checks only on architectures where it's needed
- simplify the checks where external libs (openssl, libxml, etc.) use
firm datatypes like int#if SIZEOF_INT == SIZEOF_ZEND_LONG
define ZEND_LONG_INT_OVFL(zl) (0)
define ZEND_LONG_INT_UDFL(zl) (0)
#else
define ZEND_LONG_INT_OVFL(zlong) ((zlong) > (zend_long)INT_MAX)
define
ZEND_LONG_INT_UDFL(zlong) ((zlong) < (zend_long)INT_MIN) #endif#define ZEND_SIZE_T_INT_OVFL(size) ((size) > (size_t)INT_MAX)
So having it like
If (ZEND_LONG_INT_OVFL(x)) {
return;
}I think this makes and sense and it's a good start.
What I think would be even more useful is to have such checks in ZPP. The thing
is that usual use case for libs where the most used integer type is int (e.g.
openssl) is following:
- Get zend_long "l" resp. string "s" with size_t len from zend_parse_paramters
- check if it doesn't overflow INT_MAX or when needed INT_MIN 3. cast it to
int and pass the value to the lib functionIf we have such functionality in ZPP, then it would be much simpler. I would
imaging adding "i" for int and something like "q" (maybe different letter :) ) for
string with int size. It would be basically the same as "l" and "s" but there would
be an extra int overflow check which would fail if it's bigger / smaller. That
would be very useful IMHO.
Yeah, it is probably a good addition. If it is known that some API like libxml2 requires integer which is 32-bit on LLP64, LP64 or alike so it could overflow - ZPP could check it. It is useful in the case the exact value is passed directly to the before mentioned API. If some calculations, etc. need to be done before passing to that API, the macros can still be useful.
So maybe both a ZPP and macro change were useful and it would make sense to go for an RFC first. I've started a draft here https://wiki.php.net/rfc/range_checks_for_64_bit .
Regards
Anatol
Hi Anatol, Dmitry, all,
----- Original Message -----
From: "Anatol Belski"
Sent: Friday, August 21, 2015
Hi,
Resending this as missed internals at the start.
I was lately rethinking some part of the 64-bit RFC, and also seeing now
Jakub's work on catching overflows in ext/openssl and Matt Williams
suggestions on it (which was going a bit more global over it). So I came
up
with these macros with two goals
Dang, people be gettin' my name wrong, in the same way, even "in writing."
;-P
- standardize the overflow checks
- do actualy checks only on architectures where it's needed
The compiler will already do that, as I've said of course, and more
importantly (for most cases), it will never get it wrong.
- simplify the checks where external libs (openssl, libxml, etc.) use firm
datatypes like int#if SIZEOF_INT == SIZEOF_ZEND_LONG
define ZEND_LONG_INT_OVFL(zl) (0)
define ZEND_LONG_INT_UDFL(zl) (0)
#else
define ZEND_LONG_INT_OVFL(zlong) ((zlong) > (zend_long)INT_MAX) # define
ZEND_LONG_INT_UDFL(zlong) ((zlong) < (zend_long)INT_MIN) #endif
Really no point in all that...
#define ZEND_SIZE_T_INT_OVFL(size) ((size) > (size_t)INT_MAX)
What's wrong with just doing the check, "as is," instead of hiding it behind
a [longer] macro? (Also, none of the casts are necessary...?)
So having it like
If (ZEND_LONG_INT_OVFL(x)) {
return;
}Compiler would eliminate the branch automatically on 32-bit and ILP64.
Already will. :^)
Some other macros to do signed/unsigned comparison, these can be extended.
#define ZEND_SIZE_T_GT_ZEND_LONG(size, zlong) ((zlong) < 0 || (size) >
(size_t)(zlong)) #define ZEND_SIZE_T_GTE_ZEND_LONG(size, zlong) ((zlong) <
0
|| (size) >= (size_t)(zlong)) #define ZEND_SIZE_T_LT_ZEND_LONG(size,
zlong)
((zlong) >= 0 && (size) < (size_t)(zlong)) #define
ZEND_SIZE_T_LTE_ZEND_LONG(size, zlong) ((zlong) >= 0 && (size) <=
(size_t)(zlong))
Yes, the only case where we might want to do something special is comparing
an unsigned type, like size_t (string lengths), with a signed max, since
that's not impossible to the compiler. Of course, you have to KNOW that's
what you want and/or it's safe to ignore the check, e.g. not dealing with
strings > 2 GB on 32-bit, etc.
And again, a generic way to do that for a "max" of integer type or greater
(since literals are smallest type >= int that will hold the value), that
should work anywhere, is:
#define ZEND_SIGNED_OVFL_etc(val, max) (sizeof(val) > sizeof(max) && (val) >
(max))
Simple and optimizes easily...
IMHO these and maybe more are missing after the 64-bit RFC. Do you think
they would make sense? Or would make sense now, or later in master?
Also, why, exactly, has the zend_string length been defined as size_t (just
because it's "supposed to" be?), which is incompatible with most everything
in PHP, instead of zend_long (since we had int before)?
e.g. Just look at strlen()
-- oops, size_t no worky! :-/ Dmitry? We need
range checking there, right? Conversion to double? No? Then there's no
point in using size_t, and would let the compiler auto-optimize the
above-referenced comparisons with size_t...
Thanks
Anatol
- Matt
Hi Matt,
Thanks for the comments.
-----Original Message-----
From: Matt Wilmas [mailto:php_lists@realplain.com]
Sent: Tuesday, August 25, 2015 12:06 PM
To: Anatol Belski anatol.php@belski.net; 'Dmitry Stogov'
dmitry@php.net;
'Xinchen Hui' xinchen.h@zend.com; 'Nikita Popov' nikita.ppv@gmail.com;
'Pierre Joye' pierre.php@gmail.com; 'Bob Weinand'
bobwei9@hotmail.com; 'Jakub Zelenka' bukka@php.net
Cc: internals@lists.php.net
Subject: Re: [PHP-DEV] Overflow checks and integral vars comparisonHi Anatol, Dmitry, all,
----- Original Message -----
From: "Anatol Belski"
Sent: Friday, August 21, 2015Hi,
Resending this as missed internals at the start.
I was lately rethinking some part of the 64-bit RFC, and also seeing
now Jakub's work on catching overflows in ext/openssl and Matt
Williams suggestions on it (which was going a bit more global over
it). So I came up with these macros with two goalsDang, people be gettin' my name wrong, in the same way, even "in writing."
;-P
Ohh, no intention on that, I will save it in my head, Matt Wilmas :)
- standardize the overflow checks
- do actualy checks only on architectures where it's needed
The compiler will already do that, as I've said of course, and more
importantly
(for most cases), it will never get it wrong.
- simplify the checks where external libs (openssl, libxml, etc.) use
firm datatypes like int#if SIZEOF_INT == SIZEOF_ZEND_LONG
define ZEND_LONG_INT_OVFL(zl) (0)
define ZEND_LONG_INT_UDFL(zl) (0)
#else
define ZEND_LONG_INT_OVFL(zlong) ((zlong) > (zend_long)INT_MAX)
define
ZEND_LONG_INT_UDFL(zlong) ((zlong) < (zend_long)INT_MIN) #endifReally no point in all that...
#define ZEND_SIZE_T_INT_OVFL(size) ((size) > (size_t)INT_MAX)
What's wrong with just doing the check, "as is," instead of hiding it
behind a
[longer] macro? (Also, none of the casts are necessary...?)So having it like
If (ZEND_LONG_INT_OVFL(x)) {
return;
}Compiler would eliminate the branch automatically on 32-bit and ILP64.
Already will. :^)
Some other macros to do signed/unsigned comparison, these can be
extended.#define ZEND_SIZE_T_GT_ZEND_LONG(size, zlong) ((zlong) < 0 || (size) >
(size_t)(zlong)) #define ZEND_SIZE_T_GTE_ZEND_LONG(size, zlong)
((zlong) <
0
|| (size) >= (size_t)(zlong)) #define ZEND_SIZE_T_LT_ZEND_LONG(size,
zlong)
((zlong) >= 0 && (size) < (size_t)(zlong)) #define
ZEND_SIZE_T_LTE_ZEND_LONG(size, zlong) ((zlong) >= 0 && (size) <=
(size_t)(zlong))Yes, the only case where we might want to do something special is
comparing
an unsigned type, like size_t (string lengths), with a signed max, since
that's not
impossible to the compiler. Of course, you have to KNOW that's what you
want
and/or it's safe to ignore the check, e.g. not dealing with strings > 2 GB
on 32-
bit, etc.And again, a generic way to do that for a "max" of integer type or greater
(since
literals are smallest type >= int that will hold the value), that should
work
anywhere, is:#define ZEND_SIGNED_OVFL_etc(val, max) (sizeof(val) > sizeof(max) && (val)
(max))
Simple and optimizes easily...
In general it's better to not to be optimistic about what compilers will do.
Also it is probably a question of taste. SIZEOF_INT == SIZEOF_ZEND_LONG is
something with zero calculation and guaranteed to exclude any possible side
effect, while sizeof(int) == sizeof(zend_long) depends on the optimization
choosen. Not disputing that your suggestion will do nearly the same job in
the usual case, ofc.
With the casts - probably yes, many cases will go by the standard type
promotion rules, but having the casts only ensures it does what one actually
intended to do and has no effect in the generated code. For example one
would need a cast when comparing with INT_MAX + 1. Worse is with unsigned,
because the result depends on the platform and concrete types and values.
Hiding behind the macros makes full sense for two reasons at least
- there are people with various skills that can just rely on what the API
provides without going too deep into the detail - doing an internal change won't affect anything in the existing usages
IMHO these and maybe more are missing after the 64-bit RFC. Do you
think they would make sense? Or would make sense now, or later in
master?Also, why, exactly, has the zend_string length been defined as size_t
(just
because it's "supposed to" be?), which is incompatible with most
everything in
PHP, instead of zend_long (since we had int before)?e.g. Just look at
strlen()
-- oops, size_t no worky! :-/ Dmitry? We need
range
checking there, right? Conversion to double? No? Then there's no
point in using size_t, and would let the compiler auto-optimize the
above-
referenced comparisons with size_t...
Not sure what you mean here, 32-bit? memory_limit is unsigned 32-bit there,
but even if not - it is hardly possible to have a string exceeding it there
anyway. Otherwise, there is a plenty of libraries working with size_t and it
makes full sense to exhaust that on 64-bit. And the most of PHP7 is just
fine with size_t.
Before the 64-bit RFC my idea was actually to port the dependency libs for
the full 64-bit support, but it's insane. So better to fix the side effects,
as those are almost the same as in PHP5 on most of 64-bit.
Regards
Anatol
Hi Anatol,
Just a quick reply to couple parts... Have to go, and I don't care much
either way about the stuff, just commenting before. :-)
----- Original Message -----
From: "Anatol Belski"
Sent: Tuesday, August 25, 2015
Hi Matt,
Thanks for the comments.
-----Original Message-----
From: Matt Wilmas [mailto:php_lists@realplain.com]
Sent: Tuesday, August 25, 2015 12:06 PM
To: Anatol Belski anatol.php@belski.net; 'Dmitry Stogov'
dmitry@php.net;
'Xinchen Hui' xinchen.h@zend.com; 'Nikita Popov'
nikita.ppv@gmail.com;
'Pierre Joye' pierre.php@gmail.com; 'Bob Weinand'
bobwei9@hotmail.com; 'Jakub Zelenka' bukka@php.net
Cc: internals@lists.php.net
Subject: Re: [PHP-DEV] Overflow checks and integral vars comparisonHi Anatol, Dmitry, all,
----- Original Message -----
From: "Anatol Belski"
Sent: Friday, August 21, 2015Hi,
Resending this as missed internals at the start.
I was lately rethinking some part of the 64-bit RFC, and also seeing
now Jakub's work on catching overflows in ext/openssl and Matt
Williams suggestions on it (which was going a bit more global over
it). So I came up with these macros with two goalsDang, people be gettin' my name wrong, in the same way, even "in
writing."
;-POhh, no intention on that, I will save it in my head, Matt Wilmas :)
Haha.
- standardize the overflow checks
- do actualy checks only on architectures where it's needed
The compiler will already do that, as I've said of course, and more
importantly
(for most cases), it will never get it wrong.
- simplify the checks where external libs (openssl, libxml, etc.) use
firm datatypes like int#if SIZEOF_INT == SIZEOF_ZEND_LONG
define ZEND_LONG_INT_OVFL(zl) (0)
define ZEND_LONG_INT_UDFL(zl) (0)
#else
define ZEND_LONG_INT_OVFL(zlong) ((zlong) > (zend_long)INT_MAX)
define
ZEND_LONG_INT_UDFL(zlong) ((zlong) < (zend_long)INT_MIN) #endifReally no point in all that...
#define ZEND_SIZE_T_INT_OVFL(size) ((size) > (size_t)INT_MAX)
What's wrong with just doing the check, "as is," instead of hiding it
behind a
[longer] macro? (Also, none of the casts are necessary...?)So having it like
If (ZEND_LONG_INT_OVFL(x)) {
return;
}Compiler would eliminate the branch automatically on 32-bit and ILP64.
Already will. :^)
Some other macros to do signed/unsigned comparison, these can be
extended.#define ZEND_SIZE_T_GT_ZEND_LONG(size, zlong) ((zlong) < 0 || (size) >
(size_t)(zlong)) #define ZEND_SIZE_T_GTE_ZEND_LONG(size, zlong)
((zlong) <
0
|| (size) >= (size_t)(zlong)) #define ZEND_SIZE_T_LT_ZEND_LONG(size,
zlong)
((zlong) >= 0 && (size) < (size_t)(zlong)) #define
ZEND_SIZE_T_LTE_ZEND_LONG(size, zlong) ((zlong) >= 0 && (size) <=
(size_t)(zlong))Yes, the only case where we might want to do something special is
comparing
an unsigned type, like size_t (string lengths), with a signed max, since
that's not
impossible to the compiler. Of course, you have to KNOW that's what you
want
and/or it's safe to ignore the check, e.g. not dealing with strings > 2
GB
on 32-
bit, etc.And again, a generic way to do that for a "max" of integer type or
greater
(since
literals are smallest type >= int that will hold the value), that should
work
anywhere, is:#define ZEND_SIGNED_OVFL_etc(val, max) (sizeof(val) > sizeof(max) &&
(val)(max))
Simple and optimizes easily...
In general it's better to not to be optimistic about what compilers will
do.
I didn't reply in last week's thread about the overflow checks in OpenSSL...
But it is definitely fine to be optimistic and rely on compiler to do
basic, basic stuff like this. No reason to make things more complicated
just to think one is helping the compiler.
Proof? Have you seen the whole VM in the last, like, 10 years? Using your
logic, we have major, major, MAJOR problems literally everywhere there!
Now can anyone point to problems compiling the default, specialized VM? I
rest my case. :-) There's conditional stuff ALL over the place (not
macros), and propagated to inline functions, etc. where the compiler is
"counted on" to remove the constant conditions and all that stuff.
The current FAST_ZPP parameter parsing, any unnecessary parts are optimized
out (it'd be bad if that wasn't the case). My new proposal (coming soon...)
has a lot of more complicated-looking source code, but almost all deleted by
the compiler.
Transforming zpp's type spec string at compile time, which MSVC can't do
because it does indeed suck for some reason, involves dozens of KB of
source, many inline functions, etc. but nearly 100% of it is deleted,
correctly, by GCC and Clang.
Anyway, that's why I say if "some" compiler can't do basic optimizations, it
sucks, and has much bigger problems in the rest of the source, so there's
no reason to worry about it... At all.
Also it is probably a question of taste. SIZEOF_INT == SIZEOF_ZEND_LONG is
something with zero calculation and guaranteed to exclude any possible
side
effect, while sizeof(int) == sizeof(zend_long) depends on the optimization
choosen. Not disputing that your suggestion will do nearly the same job in
the usual case, ofc.With the casts - probably yes, many cases will go by the standard type
promotion rules, but having the casts only ensures it does what one
actually
intended to do and has no effect in the generated code. For example one
would need a cast when comparing with INT_MAX + 1. Worse is with
unsigned,
because the result depends on the platform and concrete types and values.Hiding behind the macros makes full sense for two reasons at least
- there are people with various skills that can just rely on what the API
provides without going too deep into the detail- doing an internal change won't affect anything in the existing usages
IMHO these and maybe more are missing after the 64-bit RFC. Do you
think they would make sense? Or would make sense now, or later in
master?Also, why, exactly, has the zend_string length been defined as size_t
(just
because it's "supposed to" be?), which is incompatible with most
everything in
PHP, instead of zend_long (since we had int before)?e.g. Just look at
strlen()
-- oops, size_t no worky! :-/ Dmitry? We
need
range
checking there, right? Conversion to double? No? Then there's no
point in using size_t, and would let the compiler auto-optimize the
above-
referenced comparisons with size_t...Not sure what you mean here, 32-bit?
I mean, PHP's strlen()
on any platform, why is it possible to have larger
strings (size_t length) than can be returned in zend_long. That's a bug.
memory_limit is unsigned 32-bit there,
but even if not - it is hardly possible to have a string exceeding it
there
anyway.
Not possible? Then like I said, why use size_t? It would help some of
these things to get rid of it.
Otherwise, there is a plenty of libraries working with size_t and it
makes full sense to exhaust that on 64-bit. And the most of PHP7 is just
fine with size_t.
So? Why does it matter if libraries are using size_t? The same libraries
always have been and passed an int (from string length) before PHP 7. Pass
them zend_long instead now...
Before the 64-bit RFC my idea was actually to port the dependency libs for
the full 64-bit support, but it's insane. So better to fix the side
effects,
as those are almost the same as in PHP5 on most of 64-bit.Regards
Anatol
- Matt
-----Original Message-----
From: Matt Wilmas [mailto:php_lists@realplain.com]
Sent: Tuesday, August 25, 2015 2:19 PM
To: Anatol Belski anatol.php@belski.net; 'Dmitry Stogov'
dmitry@php.net;
'Xinchen Hui' xinchen.h@zend.com; 'Nikita Popov' nikita.ppv@gmail.com;
'Pierre Joye' pierre.php@gmail.com; 'Bob Weinand'
bobwei9@hotmail.com; 'Jakub Zelenka' bukka@php.net
Cc: internals@lists.php.net
Subject: Re: [PHP-DEV] Overflow checks and integral vars comparisonBut it is definitely fine to be optimistic and rely on compiler to do
basic, basic
stuff like this. No reason to make things more complicated just to think
one is
helping the compiler.Proof? Have you seen the whole VM in the last, like, 10 years? Using
your logic,
we have major, major, MAJOR problems literally everywhere there!Now can anyone point to problems compiling the default, specialized VM? I
rest
my case. :-) There's conditional stuff ALL over the place (not macros),
and
propagated to inline functions, etc. where the compiler is "counted on" to
remove the constant conditions and all that stuff.The current FAST_ZPP parameter parsing, any unnecessary parts are
optimized
out (it'd be bad if that wasn't the case). My new proposal (coming
soon...) has a
lot of more complicated-looking source code, but almost all deleted by the
compiler.Transforming zpp's type spec string at compile time, which MSVC can't do
because it does indeed suck for some reason, involves dozens of KB of
source,
many inline functions, etc. but nearly 100% of it is deleted, correctly,
by GCC and
Clang.Anyway, that's why I say if "some" compiler can't do basic optimizations,
it
sucks, and has much bigger problems in the rest of the source, so
there's no
reason to worry about it... At all.
An #ifdef seems just a bit more readable, less tricky. Clear if a compiler
tells it implements some standard like C89, it should be trustful. Or a
feature like dead code elimination. Some projects even use not standard
compiler features depending on which platform they are in much bigger extent
than PHP. My sentence about not being optimistic is about not trusting
blindly on that, not ignoring useful things.
I mean, PHP's
strlen()
on any platform, why is it possible to have larger
strings
(size_t length) than can be returned in zend_long. That's a bug.memory_limit is unsigned 32-bit there, but even if not - it is hardly
possible to have a string exceeding it there anyway.Not possible? Then like I said, why use size_t? It would help some of
these
things to get rid of it.
It is a "theoretical" bug. No issue on 32-bit, on 64-bit you would need to
have a system with over 10^9 GB of RAM to come to it :)
Otherwise, there is a plenty of libraries working with size_t and it
makes full sense to exhaust that on 64-bit. And the most of PHP7 is
just fine with size_t.So? Why does it matter if libraries are using size_t? The same libraries
always
have been and passed an int (from string length) before PHP 7. Pass them
zend_long instead now...
One aspect is because of 64-bit, to exhaust the potential of the platform.
But it is again more general discussion whether PHP is a general purpose or
web dedicated language.
Regards
Anatol
I didn't reply in last week's thread about the overflow checks in
OpenSSL...But it is definitely fine to be optimistic and rely on compiler to do
basic, basic stuff like this. No reason to make things more complicated
just to think one is helping the compiler.Proof? Have you seen the whole VM in the last, like, 10 years? Using
your logic, we have major, major, MAJOR problems literally everywhere there!Now can anyone point to problems compiling the default, specialized VM?
I rest my case. :-) There's conditional stuff ALL over the place (not
macros), and propagated to inline functions, etc. where the compiler is
"counted on" to remove the constant conditions and all that stuff.The current FAST_ZPP parameter parsing, any unnecessary parts are
optimized out (it'd be bad if that wasn't the case). My new proposal
(coming soon...) has a lot of more complicated-looking source code, but
almost all deleted by the compiler.
Transforming zpp's type spec string at compile time, which MSVC can't do
because it does indeed suck for some reason, involves dozens of KB of
source, many inline functions, etc. but nearly 100% of it is deleted,
correctly, by GCC and Clang.Anyway, that's why I say if "some" compiler can't do basic optimizations,
it sucks, and has much bigger problems in the rest of the source, so
there's no reason to worry about it... At all.
Let me disagree on this comment. Most of it and its wording.
Also it is probably a question of taste. SIZEOF_INT == SIZEOF_ZEND_LONG
is
something with zero calculation and guaranteed to exclude any possible
side
effect, while sizeof(int) == sizeof(zend_long) depends on the
optimization
choosen. Not disputing that your suggestion will do nearly the same job
in
the usual case, ofc.With the casts - probably yes, many cases will go by the standard type
promotion rules, but having the casts only ensures it does what one
actually
intended to do and has no effect in the generated code. For example one
would need a cast when comparing with INT_MAX + 1. Worse is with
unsigned,
because the result depends on the platform and concrete types and values.Hiding behind the macros makes full sense for two reasons at least
- there are people with various skills that can just rely on what the API
provides without going too deep into the detail- doing an internal change won't affect anything in the existing usages
IMHO these and maybe more are missing after the 64-bit RFC. Do you
think they would make sense? Or would make sense now, or later inmaster?
Also, why, exactly, has the zend_string length been defined as size_t
(just
because it's "supposed to" be?), which is incompatible with most
everything in
PHP, instead of zend_long (since we had int before)?
e.g. Just look at
strlen()
-- oops, size_t no worky! :-/ Dmitry? We
needrange
checking there, right? Conversion to double? No? Then there's no
point in using size_t, and would let the compiler auto-optimize theabove-
referenced comparisons with size_t...
Not sure what you mean here, 32-bit?
I mean, PHP's
strlen()
on any platform, why is it possible to have larger
strings (size_t length) than can be returned in zend_long. That's a bug.
It is wrong to use long for portability. It is also wrong (very) to use
int* for length of a buffer. It is a very well known good practice (correct
way) to use size_t, even more when interacting with external sources.
I have to disagree about your statement here too.
So? Why does it matter if libraries are using size_t? The same
libraries always have been and passed an int (from string length) before
PHP 7. Pass them zend_long instead now...
And we have issues because of this exact case. It also helps to use the
right type for a given usage. In this case size_t.
Before the 64-bit RFC my idea was actually to port the dependency libs
for
the full 64-bit support, but it's insane. So better to fix the side
effects,
as those are almost the same as in PHP5 on most of 64-bit.Regards
Anatol
- Matt