Newsgroups: php.internals Path: news.php.net Xref: news.php.net php.internals:74209 Return-Path: Mailing-List: contact internals-help@lists.php.net; run by ezmlm Delivered-To: mailing list internals@lists.php.net Received: (qmail 67263 invoked from network); 14 May 2014 18:48:37 -0000 Received: from unknown (HELO lists.php.net) (127.0.0.1) by localhost with SMTP; 14 May 2014 18:48:37 -0000 Authentication-Results: pb1.pair.com header.from=nikita.ppv@gmail.com; sender-id=pass Authentication-Results: pb1.pair.com smtp.mail=nikita.ppv@gmail.com; spf=pass; sender-id=pass Received-SPF: pass (pb1.pair.com: domain gmail.com designates 209.85.220.171 as permitted sender) X-PHP-List-Original-Sender: nikita.ppv@gmail.com X-Host-Fingerprint: 209.85.220.171 mail-vc0-f171.google.com Received: from [209.85.220.171] ([209.85.220.171:37964] helo=mail-vc0-f171.google.com) by pb1.pair.com (ecelerity 2.1.1.9-wez r(12769M)) with ESMTP id 22/A0-15285-48AB3735 for ; Wed, 14 May 2014 14:48:37 -0400 Received: by mail-vc0-f171.google.com with SMTP id lc6so3001699vcb.2 for ; Wed, 14 May 2014 11:48:34 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=FX4JGfmsMaB/AAw0ORsC7aCXEa5E2wZNVBgbAMarqaI=; b=EnEVOC+mT+BT29Hrjv3H6TuRjyCCbLZFrktTXF78/IFblX99uMvGatqJdOwoPk9TnG 4m5QmEyqdzb2T8uFCSNVvmlhX6jtVpublS+KZz4dY26+npZmhAsE4oGR7wtGb6vshjkx pc6Gjoc6IYNKpOoU4I3jVS1knuJxadf8WWrsSRX8CszN7aPbsqsEglflJIYM2dFYM4yG rtyGcSg4tARDsUP/gTpF4hNnP/7uSzuUu3wIslmT1n3nMG/Ljy6c/KyIJLhOFHorLgVZ 7PFL9SyzyeO39qY9Ok0/wZZC9dyP/c8lhNPibHtrt96lU4wPPGpqPu+QMjr+96EfU96U YiiQ== MIME-Version: 1.0 X-Received: by 10.220.103.141 with SMTP id k13mr4209717vco.25.1400093314089; Wed, 14 May 2014 11:48:34 -0700 (PDT) Received: by 10.58.248.4 with HTTP; Wed, 14 May 2014 11:48:34 -0700 (PDT) In-Reply-To: References: Date: Wed, 14 May 2014 20:48:34 +0200 Message-ID: To: Pierre Joye Cc: PHP internals Content-Type: multipart/alternative; boundary=047d7b342d30a3860a04f960a02d Subject: Re: [PHP-DEV] on memory usage with the 64bit patch, and interpretation of various numbers From: nikita.ppv@gmail.com (Nikita Popov) --047d7b342d30a3860a04f960a02d Content-Type: text/plain; charset=UTF-8 On Wed, May 14, 2014 at 6:39 PM, Pierre Joye wrote: > ## Rationale for size_t (string lengths): > > This has significant advantages. There are some costs to doing it, but > they are not as significant as they may appear on the surface. Let's > dive into it: > > ### It's The Correct Data Type > > The C89 spec indicates in 3.3.3.4 ( > http://port70.net/~nsz/c/c89/rationale/c3.html#size-95t-3-3-3-4 ) that > the size_t type was created specifically for usage in this context. It > is always, 100% guaranteed to be able to hold the bounds of every > possible array element. Strings in C are simply char arrays. > Therefore, the correct data type to use for string sizes (which really > are just an offset qualifier) is size_t. > I don't have a copy of the C standard handy, so I'll quote what C++ has to say on size_t: > The type size_t is an implementation-defined unsigned integer type that is large enough to contain the size in bytes of any object So a size_t type can contain the size of *any* object. If we want to support *any* object, then we must use the size_t type, that's correct. However, we may reasonably choose to disallow creating objects of certain types (e.g. strings) with sizes that are larger than a uint32_t, to save memory. If we chose to implement such a restriction, the uint32_t type would provide us with the *exact* same guarantees as size_t does - it is large enough to contain the size of any object that we allow to be created. ### It's The Secure Data Type > > size_t (and ptrdiff_t) are the only C89 types that are 100% guaranteed > to be able to hold the size of any possible object that the compiler > will support. Other types will vary depending on the data model that > the compiler supports, as the spec only defines minimum widths. > > This is so important that CERT issued a coding standard for it: > INT01-C ( > https://www.securecoding.cert.org/confluence/display/seccode/INT01-C.+Use+rsize_t+or+size_t+for+all+integer+values+representing+the+size+of+an+object > ). > > One of the reasons is that it's difficult to do overflow checks in a > portable way. See VU#162289: https://www.kb.cert.org/vuls/id/162289 . > In there, they recommend using the C99 uintptr_t type, but suggest > using size_t for platforms that don't have uintptr_t support (and > since we target C89 for the engine, that's out). > Overflow checks are an issue that is independent of the chosen size storage type. They always you need to occur. The addition of two size_t values can overflow in just the same way as the addition of two uint32_t values. There is no difference in security between using size_t and uint32_t. The security issue lies in absence of proper overflow checks (or in the presence of incorrect overflow checks). If anybody can come up with an example where usage of uint32_t over size_t (in a context where we ensured that allocations do not surpass that size) causes security or other safety issues, please speak up. I wasn't able to find anything in that direction. ### ZPP Changes > > The ZPP changes are critical. The reason is that varargs is casting an > arbitrary block of memory to a type, and then writing to it. So > existing code that does zpp("s", str, &int_len) would wind up with a > buffer overflow. Because zpp would be trying to write a 64 bit value > to a 32 bit container. The other 32 bits would fall off the end, into > who knows what. At BEST this can result in a segfault. At worst, > memory corruption and MASSIVE security vulnerabilities. > > Also note that the compiler *can't* and actively doesn't catch these > types of errors. That means that it's largely luck and testing that > will lead to it. > > So, I chose to break BC and rename the ZPP symbols. Because that WILL > error, and provide the developer with a meaningful indication that an > improper data type was provided. As I considered a fatal error that an > invalid type was supplied was a better way of identifying to the > developer that "HEY, THIS NEEDS TO BE CHANGED ASAP" than just letting > them hit random segfaults at runtime. > > If there is a way to get around this by giving the compiler more > information, then do it. But to just leave the types there, and leave > it to chance if a buffer overflow occurs, is dangerous. Which is why I > made the call that the ZPP types **needed** to be changed. > Afaik Johannes has written a Clang-based static analyzer that detects type mismatches between the zpp format string and the passed arguments (gcov also seems to have some kind of analyzer for that). That seems like a good way to find incorrect declarations without having to rename everything - and this also covers checks for all types that weren't changed :) Nikita --047d7b342d30a3860a04f960a02d--