Newsgroups: php.internals
Date: Thu, 23 Jan 2014 12:18:39 +0100
From: ab@php.net ("Anatol Belski")
To: "Dmitry Stogov"
Cc: "Nikita Popov", "PHP Developers Mailing List"
Subject: Re: [PHP-DEV] [RFC] 64 bit platform improvements for string length and integer

Hi Dmitry,

many thanks for taking a look at this patch.

On Thu, January 23, 2014 09:42, Dmitry Stogov wrote:
> I completely agree with Nikita.
> Why to rename LONG->INT STRLEN->STRSIZE in thousands places?
> Why not just define zend_long and zend_ulong to be 64-bit on 64-bit
> platforms and use them instead of int, ulint, zend_int, zend_uint, long,
> ulong where it's necessary.
>
> Anatol, I understood your point about catching incompatibility code at
> compile-time, but I'm not sure if the new features cost such huge code
> base changes.

Firstly, there is a historical reason: while porting, the code "had" to
break so that one could easily see the relevant places. Beyond that, it is
also a change of mindset, as those integer placeholders no longer depend on
a fixed datatype.

> 1) 64-bit integers on Windows (they are already 64-bit on other systems)
> 2) 64-bit string length. I don't think many people are interested in that.
> Fortunately, the patch doesn't change the zval size, so it shouldn't
> make a lot of harm. However, usage of "zend_size_t" instead of "int" is a
> bit annoying. I would change it into the same "zend_long" or "zend_ulong".
>

The original patch was for size_t only. On its own that would have been a
Linux/Unix-only improvement: size_t is 64 bit on 64 bit Windows, but long is
only 32 bit there, so the string length would have had to stay int or become
just unsigned. Conversely, omitting the size_t change and doing only int64
would have been an improvement on Windows alone. Doing both gives a
three-way improvement: Linux and Windows get the biggest possible strings,
and Windows gets 64 bit integers. So in fact, doing only one of those
wouldn't IMHO justify all the effort. An additional Windows improvement
that comes "for free" is that the whole file API can be used to its full
extent, that is, with large file objects and offsets.

It's true that the possibility to process gigabytes of data in memory will
not be needed every day; however, having it available is a different
matter. After all, why would anyone be interested in something that isn't
available anyway?

Keeping size_t semantically separate is good for several reasons. It's
consistent with the specification. Should it come to 64 bit integers on
32 bit platforms, it's easier to continue from there (merging size_t and
ulong would break this option).
And one day it may come to 128 bit integers (for whatever reason). I know
it's science fiction now, but wasn't a RAM size of 1 GB just as much so
10 years ago? You never know. So keeping size_t separate from unsigned int
is a good thing IMHO and keeps some interesting options open for the
future.

In the end, new vs. old names is the last thing I personally would insist
on, provided the essential modification is in place. However, code that
clearly expresses what happens is something I'd call more appropriate for
such a substantial change.

Best regards

Anatol
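
To make the point about keeping the integer and the size placeholder
separate a bit more concrete, here is a minimal sketch in C. It is not
taken from the patch: the names example_int, example_size, EXAMPLE_BITS,
example_size_to_int and example_long_narrower_than_size are hypothetical
and exist only for illustration. The sketch assumes a C99 compiler that
provides int64_t.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names for illustration only; not the patch's identifiers. */
typedef int64_t example_int;   /* integer placeholder: always 64 bit here  */
typedef size_t  example_size;  /* length placeholder: follows the platform */

/* Width of a type in bits. */
#define EXAMPLE_BITS(type) (sizeof(type) * CHAR_BIT)

/* Returns 1 if plain long is narrower than size_t (the case on 64 bit
 * Windows, where long is 32 bit), 0 otherwise. */
static int example_long_narrower_than_size(void)
{
    return sizeof(long) < sizeof(size_t);
}

/* Converts a length into the integer placeholder.
 * Returns 0 on success, -1 if the length does not fit. Because the two
 * placeholders are distinct types, the conversion has to be explicit. */
static int example_size_to_int(example_size len, example_int *out)
{
    if ((uintmax_t)len > (uintmax_t)INT64_MAX) {
        return -1;
    }
    *out = (example_int)len;
    return 0;
}
```

With the two placeholders kept apart, either one could in principle change
width later (for example a 64 bit integer on a 32 bit platform) without
touching the other; only conversions such as example_size_to_int() would
need a second look.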