Newsgroups: php.internals
From: pierre.php@gmail.com (Pierre Joye)
To: Stas Malyshev
Cc: PHP internals
Date: Fri, 16 May 2014 05:31:05 +0200
In-Reply-To: <537539AC.8080906@sugarcrm.com>
Subject: Re: [PHP-DEV] on memory usage with the 64bit patch, and interpretation of various numbers

Hi Stas,

On Fri, May 16, 2014 at 12:03 AM, Stas Malyshev wrote:
> Hi!
>
>> ### It's The Correct Data Type
>>
>> The C89 spec indicates in 3.3.3.4 (
>> http://port70.net/~nsz/c/c89/rationale/c3.html#size-95t-3-3-3-4 ) that
>> the size_t type was created specifically for usage in this context. It
>> is always, 100% guaranteed to be able to hold the bounds of every
>> possible array element. Strings in C are simply char arrays.
>
> Here is my problem with it - we don't need a type that allows to hold
> the bounds of every possible array element. It's like buying a house
> that could contain all your relatives, acquaintances, friends and people
> that you have ever met if they would decide to come to you to stay all
> at once.
> Too expensive and very impractical. We're using unified string
> sizes now, and 99% of the strings we're using never even reach limits of
> 16 bits, let alone come close to limits of int. Carrying around a 64-bit
> value to store that is just waste. We'd be just storing megabytes of
> zeroes without any use. Whatever theoretical reasons there are for
> generic application to use that, they hardly can be applied to very
> specialized and supposed to be highly optimized case as the language
> engine is. It's a fine argument in the generic case, but we do not have
> the generic case here.

I wonder whether this, and the other replies, are actually being read
correctly, but let me say it again: it is a side effect, not a goal,
at all.

>> ### It's The Secure Data Type
>
> "Security" is quickly becoming a thought-terminating cliche, both in
> programming and outside. "It's necessary for security", ergo we should
> pay whatever it takes for it and not question it. It's not right and we
> should not fall into this trap.
> We can and should question it, and
> evaluate the costs carefully.
>
> In most cases where we deal with possible overflows, 64-bit value can be
> overflown just as 32-bit can. Most integer overflows we've had in PHP
> were demonstrable on both 32-bit and 64-bit platforms.

I disagree here, even more so lately.

>> This is so important that CERT issued a coding standard for it:
>> INT01-C ( https://www.securecoding.cert.org/confluence/display/seccode/INT01-C.+Use+rsize_t+or+size_t+for+all+integer+values+representing+the+size+of+an+object
>> ).
>
> This is a very good generic advice for writing generic functions like
> copy() example there. However, again, we don't have generic case here.
> We can have our copy()'s and emalloc()'s work with size_t. However, when
> we talking about zval, it's a bit different story.

I disagree again, and I am really not alone. If we open our minds, it
has been proven that this is the way to go. I am not sure what else I
could say, as we have already answered this point numerous times.

>> One of the reasons is that it's difficult to do overflow checks in a
>> portable way. See VU#162289: https://www.kb.cert.org/vuls/id/162289 .
>
> I agree, doing it right may be tricky. However, we already have
> primitives for doing it that work and are optimized for common
> platforms. It is a solved problem. So bringing it as an argument does
> not make much sense - we don't need an additional effort to do this
> difficult thing, because we've already done it. Of course, these checks
> have to be actually used - but they would have to be used in 64-bit case
> too!

My point exactly: we did not do this job, not even remotely.

>> ### About Long Strings
>>
>> The fact that changing to size_t allows strings (and arrays) to be > 4gb
>> is a side-effect. A welcome one, but a side effect none the less.
>> The primary reason to use it is that it's the correct data type, and
>> gives you the most safety and security.
>
> Here I must disagree again, as I see inflating the string length
> variable as the most unwelcome side effect. Which we may yet find a way
> to tolerate it and work around it, but by itself it is nothing but a
> drag for anybody but 0.0001% of PHP developers who actually finds it
> necessary to stuff 4G strings into PHP.

I have to sigh here. Again, we do not care about 4G strings; they are a
side effect.

>> But that's at the structure level. Let's look at what actually happens
>> in practice. Dmitry himself also provides these answers. The average
>> memory increase is 8% for Wordpress, and 6% for ZF1.

These numbers are wrong. I have shown you and the list other numbers,
which are half of what Dmitry shows. And that is without investigating
deeper in phpng, which offers more room for improvement and
optimization than 5.x, which we use as the base for tests,
compatibility and performance comparisons. How hard is it to understand
that we cannot use phpng as a base as of now?

>> Let's put that 8% in context. Wordpress used 12MB, and now it uses
>> 13MB. 1MB more. That's not overly significant. ZF used 29MB. Now it
>> uses 31MB. Still not overly significant.
>
> I think it is pretty significant. If we could reduce memory usage by
> 6-8%, would we consider it a win? I think we would. Thus, we should
> consider the same increase a loss. However, the bigger loss may be in
> inflating the sizes of frequently-used structures like zend_string.

It is not 8%. Again.

> I think we should look very closely at how we can reduce the memory
> impact and not just dismiss it as insignificant. I like the idea of the
> patch, and the cleanup of the types and 64-bit support has been long
> overdue. However, I would hate to pay for that by dragging literally
> megabytes of zeroes around for no purpose but to satisfy an abstract
> requirement written for generic case.

I agree here. Even with 4%, we can improve it more. And we will.
However, it seems that some readers take wrong numbers as fact and
prototypes as final versions, and then decide to shut down a
long-overdue change based on these temporary or promising numbers,
which will keep changing anyway until phpng gets somewhere near alpha.

I am somehow out of arguments to explain that :)

Cheers,
--
Pierre

@pierrejoye | http://www.libgd.org