Newsgroups: php.internals Path: news.php.net Xref: news.php.net php.internals:71438 Return-Path: Mailing-List: contact internals-help@lists.php.net; run by ezmlm Delivered-To: mailing list internals@lists.php.net Received: (qmail 40742 invoked from network); 23 Jan 2014 14:21:58 -0000 Received: from unknown (HELO lists.php.net) (127.0.0.1) by localhost with SMTP; 23 Jan 2014 14:21:58 -0000 Authentication-Results: pb1.pair.com header.from=ab@php.net; sender-id=unknown Authentication-Results: pb1.pair.com smtp.mail=ab@php.net; spf=unknown; sender-id=unknown Received-SPF: unknown (pb1.pair.com: domain php.net does not designate 85.214.73.107 as permitted sender) X-PHP-List-Original-Sender: ab@php.net X-Host-Fingerprint: 85.214.73.107 klapt.com Received: from [85.214.73.107] ([85.214.73.107:46848] helo=klapt.com) by pb1.pair.com (ecelerity 2.1.1.9-wez r(12769M)) with ESMTP id B2/10-39789-38521E25 for ; Thu, 23 Jan 2014 09:21:57 -0500 Received: by klapt.com (Postfix, from userid 33) id 27CD523D60EC; Thu, 23 Jan 2014 15:21:53 +0100 (CET) Received: from 178.10.239.203 (SquirrelMail authenticated user anatol@belski.net) by webmail.klapt.com with HTTP; Thu, 23 Jan 2014 15:21:53 +0100 Message-ID: <7742902851ab8ce48f58e1f9559b9b7e.squirrel@webmail.klapt.com> In-Reply-To: References: Date: Thu, 23 Jan 2014 15:21:53 +0100 To: "Dmitry Stogov" Cc: "Nikita Popov" , "PHP Developers Mailing List" Reply-To: "Anatol Belski" User-Agent: SquirrelMail/1.5.2 [SVN] MIME-Version: 1.0 Content-Type: text/plain;charset=UTF-8 Content-Transfer-Encoding: 8bit Subject: Re: [PHP-DEV] [RFC] 64 bit platform improvements for string length and integer From: ab@php.net ("Anatol Belski") Hi Dmitry, On Thu, January 23, 2014 13:37, Dmitry Stogov wrote: > historical reason of the patch itself doesn't matter :) If there's no > significant reason to introduce new names lets use old ones. > > Even with new names anyone is able to change macros but not types. > It's better to use some smarter compile-time protection. > I would try to play with "compile time assert" (in GCC it must be even > possible to use __builtin_types_compatible_b()). For example: > > > #define CONCAT_TOKENS(a, b) a ## b > #define EXPAND_THEN_CONCAT(a, b) CONCAT_TOKENS(a, b) > #define COMPILE_TIME_ASSERT(expr) do { \ > enum { EXPAND_THEN_CONCAT(ASSERT_line_,__LINE__) = 1 / !!(expr) }; \ > } while(0) > > > int main() { > int a; > > COMPILE_TIME_ASSERT(__builtin_types_compatible_p(typeof(a), long)); > COMPILE_TIME_ASSERT(sizeof(a) == 8); > return 0; } > That is what also Gopal pointed me to with this live example https://github.com/bagder/curl/blob/master/include/curl/typecheck-gcc.h There's a similar functionality with Visual C++ http://msdn.microsoft.com/en-us/library/dd537655.aspx but I'm not sure it's C++ only. Will have to try. Generally why I left the idea - as it were only clean for new implementation, with the old implementation (that compat.h stuff) one would need to use #undef which looks like a hack. Another thing is that iterating through the param spec and comparing it with the passed args on compile time were another tricky job. Now where you say it, I think that check is at least worth a try and even being in the new PHP only it would be a great thing. So putting on the TODO, as the most stuff is ported now. > >> >>> 1) 64-bit integers on Windows (they are already 64-bit on other >>> systems) 2) 64-bit string length. I don't think many people are >>> interested in >> that. >>> Fortunately, the patch doesn't change the zval size, so it shouldn't >>> make a lot of harm. However, usage of "zend_size_t" instead of "int" >>> is a bit annoying. I would change it into the same "zend_long" or >> "zend_ulong". >> >>> >> The original patch was for size_t only. With only that it were >> Linux/Unix >> only improvement, as size_t is 64 bit on Windows so it'd have to stay >> int or become just unsigned. Omitting the size_t change and doing int64 >> were only improvement on Windows. Adding both is the three-way >> improvement - Linux and Windows with biggest possible strings, Windows >> with 64 bit integers. So in fact, doing only one of those wouldn't IMHO >> justify all the effort. >> > > After some thoughts I think that usage of "size_t" is a good thing for > the future support of X32 ABI. > > X86: sizeof(int)=4 sizeof(long)=4 sizeof(sizet_t)=4 > X86-64 (Linux): sizeof(int)=4 sizeof(long)=8 sizeof(sizet_t)=8 > X86-64 (Win64): sizeof(int)=4 sizeof(long)=4 sizeof(sizet_t)=8 > X32: sizeof(int)=4 sizeof(long)=8 sizeof(sizet_t)=4 > Ok, there is even more platform mess out here :) > >> >> Additional Windows improvement "for free" is the whole file API >> exhausting, so large file objects and offsets. >> >> That's true, the possibility to process gigabytes of data in memory >> will not be needed every day, however the presence of it is something >> else. Like, why should I be interested on something not available >> anyway? >> >> Keeping size_t separated semantically is good for several reasons. It's >> clean with the specification. Should it come to 64 bit integer on 32 >> bit platform, it's easier to continue (merging size_t and ulong would >> break this option). And, one day it can come to 128 bit integers (for >> what reasons ever). I know, it's science fiction now, but wasn't a RAM >> size of 1Gb so 10 years ago? You never know. So size_t separated from >> unsigned int is a good thing imho and keeps some interesting options >> open for the future. >> > > Agreed. > > > >> At the end line, the new vs. old names is the last thing I personally >> would ultimately hang on, given the essential modification is in place. >> However the code clearly expressing what happens is something I'd call >> more appropriate with such a big substantial change. >> > > It's questionable. I' mot sure what names are better but big changes are > always annoying. May be it makes sense to extend votiing with options > ("with renaming" and > "without renaming") > Yeah, independent from the concrete names, it's about keeping balance between the real functionality and annoyance. I've split options in the RFC into 4 separate votes so far. All yes/no options about the patch itself, zpp, renamed macros and SAPIs. As all that zend_*_t and php_*_t types didn't exist before, would you like suggest other names? Please see the RFC, there are also type replacements for off_t and struct stat. Thanks Anatol