Newsgroups: php.internals Path: news.php.net Xref: news.php.net php.internals:96085 Return-Path: Mailing-List: contact internals-help@lists.php.net; run by ezmlm Delivered-To: mailing list internals@lists.php.net Received: (qmail 99212 invoked from network); 21 Sep 2016 23:07:18 -0000 Received: from unknown (HELO lists.php.net) (127.0.0.1) by localhost with SMTP; 21 Sep 2016 23:07:18 -0000 Authentication-Results: pb1.pair.com smtp.mail=smalyshev@gmail.com; spf=pass; sender-id=pass Authentication-Results: pb1.pair.com header.from=smalyshev@gmail.com; sender-id=pass Received-SPF: pass (pb1.pair.com: domain gmail.com designates 209.85.218.47 as permitted sender) X-PHP-List-Original-Sender: smalyshev@gmail.com X-Host-Fingerprint: 209.85.218.47 mail-oi0-f47.google.com Received: from [209.85.218.47] ([209.85.218.47:34331] helo=mail-oi0-f47.google.com) by pb1.pair.com (ecelerity 2.1.1.9-wez r(12769M)) with ESMTP id 06/10-04117-6A213E75 for ; Wed, 21 Sep 2016 19:07:18 -0400 Received: by mail-oi0-f47.google.com with SMTP id a62so78545665oib.1 for ; Wed, 21 Sep 2016 16:07:18 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=subject:to:references:cc:from:message-id:date:user-agent :mime-version:in-reply-to:content-transfer-encoding; bh=X5rsbQP+0nOGaZ7giyoxoTAfetMuBe8ZOZsReK9RyfQ=; b=Cr1trMn9lskF2KOd0i1OMWh2X2r14FSk1meqZpjagewRhlLucqZwlD1Kz5MoRNnHhb PVVl5olxbnUdiy9McPbmf+yAHruzGbXgz0/p0flSd3RgAo6NPgffbcAtTWrTQfMr7qiZ KrhqhrIqEM9XB5IGrUdcyxBxCZ4/mFTYthzlojaFLAY/HJRWIPi1RytDllPlvYhNT911 sz5eYBGeUm7iw0HVSJVJsrIVed14agKVKW7ZqRIzJF6lEQEIT3ypCF4sYYAGjPGMoCIS xvPfXyzumrDcaPZVa9vISFgI2z9joeXfrT/ekWotY1/ZqtwsHArCbYgGV3WUeUvfoqRt hd1Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:subject:to:references:cc:from:message-id:date :user-agent:mime-version:in-reply-to:content-transfer-encoding; bh=X5rsbQP+0nOGaZ7giyoxoTAfetMuBe8ZOZsReK9RyfQ=; b=axzM/vIThylA+M/+k1Taj6vnm4T5iVQd7xhSM09tcUXaXsQoP0x8tbINehzJF2nQRI 3B9y4+QpTC+ykCmDKe8f9FjdWnDNqRlLUHG0JhjrrbFyf83pnphf9Cai/pdxunuwaoiK 3PUmieqrF3AUIFcLsW84w8onG3L7sJJvhsmmyHpCNU0tGIcNnM1qFk35NIsTERmAo4X0 xfLI6uYgi8GpVLPrYrpUV7iqIFD1IIzAq3SrNjCna4lUBm4JfjSZDIr4ZkQRdGORtp/N vYVXp4p6B0/DPxd4JK0CAjx6vsbfUPqIk39N9QZswLMnsHfaarLqpz0c6JMsvkvPPhgK JYzg== X-Gm-Message-State: AE9vXwN/EeieAc0MiBsbKo8POtHtgHQlIK/hDAwHatsSw1BPpOGogeOH+DwkWYvhdOBRrw== X-Received: by 10.202.187.215 with SMTP id l206mr3802918oif.173.1474499234928; Wed, 21 Sep 2016 16:07:14 -0700 (PDT) Received: from [192.168.2.102] (108-233-206-104.lightspeed.sntcca.sbcglobal.net. [108.233.206.104]) by smtp.gmail.com with ESMTPSA id c4sm62880oia.28.2016.09.21.16.07.14 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 21 Sep 2016 16:07:14 -0700 (PDT) To: Nikita Popov References: <7d5727ba-da33-e3c5-1d1f-318c45d81616@cubiclesoft.com> <9522ebc9-8d8b-045e-b701-02f1166063e6@gmail.com> <40868951-8BDA-4860-884C-B8252C1839E3@gmail.com> Cc: PHP Internals Message-ID: Date: Wed, 21 Sep 2016 16:07:13 -0700 User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.11; rv:45.0) Gecko/20100101 Thunderbird/45.2.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit Subject: Re: [PHP-DEV] HashDoS From: smalyshev@gmail.com (Stanislav Malyshev) Hi! >> Lets [try] to quantify the probability of reaching the collision limit C > with a hashtable of size N and assuming a random hash distribution. The > upper bound for this should be (N choose C) * (1/N)^(C-1), with (1/N)^(C-1) > being the probability that C elements hash to one value and (N over C) the > number of C element subsets of an N element set. For large N and N >> C we > approximate (N choose C) to (e*N/C)^C / sqrt(2pi*C). As such our upper > bound becomes N * (e/C)^C / sqrt(2pi*C). Choosing N = 2^31 (largest > possible hashtable) we get for C = 20 probability 8.9E-10 and for C = 100 > probability 2.3E-149. The patch uses C = 1000. I assume you're talking here about per-hashtable limit? Here I notice two things: 1. Hash keys are not really random. Which can both improve and make it worse, but we have a lot of keys named "key" and not a lot of keys named "\x07\xF5\xDD". 2. This does not account for the fact that if we have two keys used by app colliding, we'll probably see this collision a lot. Of course, if we're counting per-hashtable, that would make is somewhat less, but still. To validate this theory, I applied the following patch: https://gist.github.com/smalyshev/2c3c085998ba81d6c163a27386aaacd8 And run composer update on one of my projects. I got these lines: Destroyed hash with coll = 48454 Destroyed hash with coll = 41871 Destroyed hash with coll = 41828 Destroyed hash with coll = 40535 Destroyed hash with coll = 32774 Destroyed hash with coll = 22782 Destroyed hash with coll = 21381 Destroyed hash with coll = 15232 Destroyed hash with coll = 14745 Destroyed hash with coll = 12216 Destroyed hash with coll = 11742 Destroyed hash with coll = 11625 Destroyed hash with coll = 8916 Destroyed hash with coll = 6820 Destroyed hash with coll = 5911 Destroyed hash with coll = 5429 Destroyed hash with coll = 3390 Destroyed hash with coll = 1188 I.e. there were 18 hashes with more than 1000 collisions on the course of the run. Is there something wrong with my count or am I counting wrong things (I didn't see the patch for limiting counts so maybe I'm still counting wrong things). > In other words, it is extremely unlikely that you hit the collision limit > by accident, with non-malicious data. So no, the user does not have to > discriminate normal from malicious causes. If the limit triggers, it's > malicious. I'm not so sure of this now - unless, again, I misunderstand what we're counting. -- Stas Malyshev smalyshev@gmail.com