Newsgroups: php.internals Path: news.php.net Xref: news.php.net php.internals:87905 Return-Path: Mailing-List: contact internals-help@lists.php.net; run by ezmlm Delivered-To: mailing list internals@lists.php.net Received: (qmail 27085 invoked from network); 25 Aug 2015 12:19:06 -0000 Received: from unknown (HELO lists.php.net) (127.0.0.1) by localhost with SMTP; 25 Aug 2015 12:19:06 -0000 Authentication-Results: pb1.pair.com smtp.mail=php_lists@realplain.com; spf=permerror; sender-id=unknown Authentication-Results: pb1.pair.com header.from=php_lists@realplain.com; sender-id=unknown Received-SPF: error (pb1.pair.com: domain realplain.com from 68.114.190.26 cause and error) X-PHP-List-Original-Sender: php_lists@realplain.com X-Host-Fingerprint: 68.114.190.26 mtaout001-public.msg.strl.va.charter.net Received: from [68.114.190.26] ([68.114.190.26:36393] helo=mtaout001.msg.strl.va.charter.net) by pb1.pair.com (ecelerity 2.1.1.9-wez r(12769M)) with ESMTP id 07/22-05482-53D5CD55 for ; Tue, 25 Aug 2015 08:19:04 -0400 Received: from impout002 ([68.114.189.17]) by mtaout001.msg.strl.va.charter.net (InterMail vM.9.00.020.01 201-2473-160) with ESMTP id <20150825121859.MKII28847.mtaout001.msg.strl.va.charter.net@impout002>; Tue, 25 Aug 2015 07:18:59 -0500 Received: from pc1 ([96.35.251.86]) by impout002 with charter.net id 90Jw1r0061sc0so010Jw24; Tue, 25 Aug 2015 07:18:59 -0500 X-Authority-Analysis: v=2.1 cv=Lrgcyxtc c=1 sm=1 tr=0 a=Is5gsZaFXO8aPum+t7Tz+g==:117 a=Is5gsZaFXO8aPum+t7Tz+g==:17 a=hOpmn2quAAAA:8 a=BCPeO_TGAAAA:8 a=8nJEP1OIZ-IA:10 a=9a7puj6YAAAA:8 a=67BIL_jfAAAA:8 a=lRBSXoDDAAAA:8 a=pGLkceISAAAA:8 a=69EAbJreAAAA:8 a=Felypnej_KhW6vzBvA8A:9 a=E_3nUM1A51EIk37Y:21 a=rsOBSX6PYFOXvM7N:21 a=wPNLvfGTeEIA:10 Message-ID: <93061F8216C14196896DAC28DEE55982@pc1> To: "Anatol Belski" , "'Dmitry Stogov'" , "'Xinchen Hui'" , "'Nikita Popov'" , "'Pierre Joye'" , "'Bob Weinand'" , "'Jakub Zelenka'" Cc: References: <02a601d0dbed$2c828df0$8587a9d0$@belski.net> <01b801d0df28$fccbe090$f663a1b0$@belski.net> Date: Tue, 25 Aug 2015 07:18:56 -0500 MIME-Version: 1.0 Content-Type: text/plain; format=flowed; charset="iso-8859-1"; reply-type=original Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2900.5931 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.6157 Subject: Re: [PHP-DEV] Overflow checks and integral vars comparison From: php_lists@realplain.com ("Matt Wilmas") Hi Anatol, Just a quick reply to couple parts... Have to go, and I don't care much either way about the stuff, just commenting before. :-) ----- Original Message ----- From: "Anatol Belski" Sent: Tuesday, August 25, 2015 > Hi Matt, > > Thanks for the comments. > >> -----Original Message----- >> From: Matt Wilmas [mailto:php_lists@realplain.com] >> Sent: Tuesday, August 25, 2015 12:06 PM >> To: Anatol Belski ; 'Dmitry Stogov' > ; >> 'Xinchen Hui' ; 'Nikita Popov' >> ; >> 'Pierre Joye' ; 'Bob Weinand' >> ; 'Jakub Zelenka' >> Cc: internals@lists.php.net >> Subject: Re: [PHP-DEV] Overflow checks and integral vars comparison >> >> Hi Anatol, Dmitry, all, >> >> ----- Original Message ----- >> From: "Anatol Belski" >> Sent: Friday, August 21, 2015 >> >> > Hi, >> > >> > Resending this as missed internals at the start. >> > >> > I was lately rethinking some part of the 64-bit RFC, and also seeing >> > now Jakub's work on catching overflows in ext/openssl and Matt >> > Williams suggestions on it (which was going a bit more global over >> > it). So I came up with these macros with two goals >> >> Dang, people be gettin' my name wrong, in the same way, even "in >> writing." >> ;-P >> > Ohh, no intention on that, I will save it in my head, Matt Wilmas :) Haha. >> > - standardize the overflow checks >> > - do actualy checks only on architectures where it's needed >> >> The compiler will already do that, as I've said of course, and more > importantly >> (for most cases), it will never get it wrong. >> >> > - simplify the checks where external libs (openssl, libxml, etc.) use >> > firm datatypes like int >> > >> > #if SIZEOF_INT == SIZEOF_ZEND_LONG >> > # define ZEND_LONG_INT_OVFL(zl) (0) >> > # define ZEND_LONG_INT_UDFL(zl) (0) >> > #else >> > # define ZEND_LONG_INT_OVFL(zlong) ((zlong) > (zend_long)INT_MAX) # >> > define >> > ZEND_LONG_INT_UDFL(zlong) ((zlong) < (zend_long)INT_MIN) #endif >> >> Really no point in all that... >> >> > #define ZEND_SIZE_T_INT_OVFL(size) ((size) > (size_t)INT_MAX) >> >> What's wrong with just doing the check, "as is," instead of hiding it > behind a >> [longer] macro? (Also, none of the casts are necessary...?) >> >> > So having it like >> > >> > If (ZEND_LONG_INT_OVFL(x)) { >> > return; >> > } >> > >> > Compiler would eliminate the branch automatically on 32-bit and ILP64. >> >> Already will. :^) >> >> > Some other macros to do signed/unsigned comparison, these can be >> extended. >> > >> > #define ZEND_SIZE_T_GT_ZEND_LONG(size, zlong) ((zlong) < 0 || (size) > >> > (size_t)(zlong)) #define ZEND_SIZE_T_GTE_ZEND_LONG(size, zlong) >> > ((zlong) < >> > 0 >> > || (size) >= (size_t)(zlong)) #define ZEND_SIZE_T_LT_ZEND_LONG(size, >> > zlong) >> > ((zlong) >= 0 && (size) < (size_t)(zlong)) #define >> > ZEND_SIZE_T_LTE_ZEND_LONG(size, zlong) ((zlong) >= 0 && (size) <= >> > (size_t)(zlong)) >> >> Yes, the only case where we might want to do something special is > comparing >> an unsigned type, like size_t (string lengths), with a signed max, since > that's not >> impossible to the compiler. Of course, you have to KNOW that's what you > want >> and/or it's safe to ignore the check, e.g. not dealing with strings > 2 >> GB > on 32- >> bit, etc. >> >> And again, a generic way to do that for a "max" of integer type or >> greater > (since >> literals are smallest type >= int that will hold the value), that should > work >> anywhere, is: >> >> #define ZEND_SIGNED_OVFL_etc(val, max) (sizeof(val) > sizeof(max) && >> (val) >> >> (max)) >> >> Simple and optimizes easily... >> > In general it's better to not to be optimistic about what compilers will > do. I didn't reply in last week's thread about the overflow checks in OpenSSL... But it is *definitely* fine to be optimistic and rely on compiler to do basic, basic stuff like this. No reason to make things more complicated just to think one is helping the compiler. Proof? Have you seen the whole VM in the last, like, 10 years? Using your logic, we have major, major, MAJOR problems literally everywhere there! Now can anyone point to problems compiling the default, specialized VM? I rest my case. :-) There's conditional stuff ALL over the place (not macros), and propagated to inline functions, etc. where the compiler is "counted on" to remove the constant conditions and all that stuff. The current FAST_ZPP parameter parsing, any unnecessary parts are optimized out (it'd be bad if that wasn't the case). My new proposal (coming soon...) has a lot of more complicated-looking source code, but almost all deleted by the compiler. Transforming zpp's type spec string at compile time, which MSVC can't do because it does indeed suck for some reason, involves *dozens* of KB of source, many inline functions, etc. but nearly 100% of it is deleted, correctly, by GCC and Clang. Anyway, that's why I say if "some" compiler can't do basic optimizations, it *sucks*, and has much bigger problems in the rest of the source, so there's no reason to worry about it... At all. > Also it is probably a question of taste. SIZEOF_INT == SIZEOF_ZEND_LONG is > something with zero calculation and guaranteed to exclude any possible > side > effect, while sizeof(int) == sizeof(zend_long) depends on the optimization > choosen. Not disputing that your suggestion will do nearly the same job in > the usual case, ofc. > > With the casts - probably yes, many cases will go by the standard type > promotion rules, but having the casts only ensures it does what one > actually > intended to do and has no effect in the generated code. For example one > would need a cast when comparing with INT_MAX + 1. Worse is with > unsigned, > because the result depends on the platform and concrete types and values. > > Hiding behind the macros makes full sense for two reasons at least > > - there are people with various skills that can just rely on what the API > provides without going too deep into the detail > - doing an internal change won't affect anything in the existing usages > >> > IMHO these and maybe more are missing after the 64-bit RFC. Do you >> > think they would make sense? Or would make sense now, or later in > master? >> >> Also, why, exactly, has the zend_string length been defined as size_t > (just >> because it's "supposed to" be?), which is incompatible with most > everything in >> PHP, instead of zend_long (since we had int before)? >> >> e.g. Just look at strlen() -- oops, size_t no worky! :-/ Dmitry? We >> need > range >> checking there, right? Conversion to double? No? Then there's *no >> point* in using size_t, and would let the compiler auto-optimize the > above- >> referenced comparisons with size_t... >> > Not sure what you mean here, 32-bit? I mean, PHP's strlen() on any platform, why is it possible to have larger strings (size_t length) than can be returned in zend_long. That's a bug. > memory_limit is unsigned 32-bit there, > but even if not - it is hardly possible to have a string exceeding it > there > anyway. Not possible? Then like I said, why use size_t? It would help some of these things to get rid of it. > Otherwise, there is a plenty of libraries working with size_t and it > makes full sense to exhaust that on 64-bit. And the most of PHP7 is just > fine with size_t. So? Why does it matter if libraries are using size_t? The same libraries always have been and passed an int (from string length) before PHP 7. Pass them zend_long instead now... > Before the 64-bit RFC my idea was actually to port the dependency libs for > the full 64-bit support, but it's insane. So better to fix the side > effects, > as those are almost the same as in PHP5 on most of 64-bit. > > Regards > > Anatol - Matt