Newsgroups: php.internals Path: news.php.net Xref: news.php.net php.internals:71435 Return-Path: Mailing-List: contact internals-help@lists.php.net; run by ezmlm Delivered-To: mailing list internals@lists.php.net Received: (qmail 32395 invoked from network); 23 Jan 2014 12:37:33 -0000 Received: from unknown (HELO lists.php.net) (127.0.0.1) by localhost with SMTP; 23 Jan 2014 12:37:33 -0000 Authentication-Results: pb1.pair.com header.from=dmitry@zend.com; sender-id=unknown Authentication-Results: pb1.pair.com smtp.mail=dmitry@zend.com; spf=permerror; sender-id=unknown Received-SPF: error (pb1.pair.com: domain zend.com from 74.125.82.169 cause and error) X-PHP-List-Original-Sender: dmitry@zend.com X-Host-Fingerprint: 74.125.82.169 mail-we0-f169.google.com Received: from [74.125.82.169] ([74.125.82.169:38420] helo=mail-we0-f169.google.com) by pb1.pair.com (ecelerity 2.1.1.9-wez r(12769M)) with ESMTP id 1E/72-18911-C0D01E25 for ; Thu, 23 Jan 2014 07:37:33 -0500 Received: by mail-we0-f169.google.com with SMTP id u57so1086524wes.28 for ; Thu, 23 Jan 2014 04:37:30 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:in-reply-to:references:date :message-id:subject:from:to:cc:content-type; bh=fbX2a7ykqdy3DI3s4A1cfnsdii5rYReMiuhOtwaKz4E=; b=EigIj1MsyXW8TIk79IOhpJtAQRzj4ss4PFY3WrygCj8qmfLHrOEmLbvsr9aIDvq4UI LlJJbUrgnSvIlLBVRp5NGjzncoZQ+Uxv99iR1YsIAu9FgaG1m2ewpAlOV+PeZq0cfzAh 6cEkKEFUF5J692o0vMIZ71PUIXHo2TbAtK+9XNm0AzdNKhuDNf7s/gX30ZRd5UHcaG+P hyl+v9Sa1piOWJKph+lba9UOFEfzbQxyZMEuni7wJJdKfXBEYiuPfgmpalcdHG1fiegO qmYK7crN727vpko+cMu7ei6De8V9KJ2htGAr6wU8T2s1Ou56+eJbFpeBir8jKu43mrkf 7Xqw== X-Gm-Message-State: ALoCoQnUywbWCV5ZdiluHp4o7V+OZTexGgKGzkrBZd1kcuktgXlXRlOIbgKVxuEI/4jrBl9aX/AVNsfyI6Ne3jf20gMdKQMKYhyak+o8GFi6SwKx2/np6zw5F5y756g69V8NyQBDRk65 MIME-Version: 1.0 X-Received: by 10.180.38.98 with SMTP id f2mr19109744wik.17.1390480650385; Thu, 23 Jan 2014 04:37:30 -0800 (PST) Received: by 10.227.91.196 with HTTP; Thu, 23 Jan 2014 04:37:30 -0800 (PST) In-Reply-To: References: Date: Thu, 23 Jan 2014 16:37:30 +0400 Message-ID: To: Anatol Belski Cc: Nikita Popov , PHP Developers Mailing List Content-Type: multipart/alternative; boundary=e89a8f6433783c0f2504f0a2818a Subject: Re: [PHP-DEV] [RFC] 64 bit platform improvements for string length and integer From: dmitry@zend.com (Dmitry Stogov) --e89a8f6433783c0f2504f0a2818a Content-Type: text/plain; charset=UTF-8 Hi Anatol, On Thu, Jan 23, 2014 at 3:18 PM, Anatol Belski wrote: > Hi Dmitry, > > greateful thanks for taking a look at this patch. > > On Thu, January 23, 2014 09:42, Dmitry Stogov wrote: > > I completely agree with Nikita. > > Why to rename LONG->INT STRLEN->STRSIZE in thousands places? > > Why not just define zend_long and zend_ulong to be 64-bit on 64-bit > > platforms and use them instead of int, ulint, zend_int, zend_uint, long, > > ulong where it's necessary. > > > > Anatol, I understood your point about catching incompatibility code at > > compile-time, but I'm not sure if the new features cost such huge code > > base changes. > Firstly it's the historical reason as while porting it "had" to break so > one could easy see the relevant places. After that - it's also a > transformation in the mind, as that int placeholders do not depend on a > fixed datatype anymore. > historical reason of the patch itself doesn't matter :) If there's no significant reason to introduce new names lets use old ones. Even with new names anyone is able to change macros but not types. It's better to use some smarter compile-time protection. I would try to play with "compile time assert" (in GCC it must be even possible to use __builtin_types_compatible_b()). For example: #define CONCAT_TOKENS(a, b) a ## b #define EXPAND_THEN_CONCAT(a, b) CONCAT_TOKENS(a, b) #define COMPILE_TIME_ASSERT(expr) do { \ enum { EXPAND_THEN_CONCAT(ASSERT_line_,__LINE__) = 1 / !!(expr) }; \ } while(0) int main() { int a; COMPILE_TIME_ASSERT(__builtin_types_compatible_p(typeof(a), long)); COMPILE_TIME_ASSERT(sizeof(a) == 8); return 0; } > > > 1) 64-bit integers on Windows (they are already 64-bit on other systems) > > 2) 64-bit string length. I don't think many people are interested in > that. > > Fortunately, the patch doesn't change the zval size, so it shouldn't > > make a lot of harm. However, usage of "zend_size_t" instead of "int" is a > > bit annoying. I would change it into the same "zend_long" or > "zend_ulong". > > > The original patch was for size_t only. With only that it were Linux/Unix > only improvement, as size_t is 64 bit on Windows so it'd have to stay int > or become just unsigned. Omitting the size_t change and doing int64 were > only improvement on Windows. Adding both is the three-way improvement - > Linux and Windows with biggest possible strings, Windows with 64 bit > integers. So in fact, doing only one of those wouldn't IMHO justify all > the effort. > After some thoughts I think that usage of "size_t" is a good thing for the future support of X32 ABI. X86: sizeof(int)=4 sizeof(long)=4 sizeof(sizet_t)=4 X86-64 (Linux): sizeof(int)=4 sizeof(long)=8 sizeof(sizet_t)=8 X86-64 (Win64): sizeof(int)=4 sizeof(long)=4 sizeof(sizet_t)=8 X32: sizeof(int)=4 sizeof(long)=8 sizeof(sizet_t)=4 > > Additional Windows improvement "for free" is the whole file API > exhausting, so large file objects and offsets. > > That's true, the possibility to process gigabytes of data in memory will > not be needed every day, however the presence of it is something else. > Like, why should I be interested on something not available anyway? > > Keeping size_t separated semantically is good for several reasons. It's > clean with the specification. Should it come to 64 bit integer on 32 bit > platform, it's easier to continue (merging size_t and ulong would break > this option). And, one day it can come to 128 bit integers (for what > reasons ever). I know, it's science fiction now, but wasn't a RAM size of > 1Gb so 10 years ago? You never know. So size_t separated from unsigned int > is a good thing imho and keeps some interesting options open for the > future. > Agreed. > At the end line, the new vs. old names is the last thing I personally > would ultimately hang on, given the essential modification is in place. > However the code clearly expressing what happens is something I'd call > more appropriate with such a big substantial change. > It's questionable. I' mot sure what names are better but big changes are always annoying. May be it makes sense to extend votiing with options ("with renaming" and "without renaming") Thanks. Dmitry. > > Best regards > > Anatol > > > > --e89a8f6433783c0f2504f0a2818a--