Newsgroups: php.internals Path: news.php.net Xref: news.php.net php.internals:96084 Return-Path: Mailing-List: contact internals-help@lists.php.net; run by ezmlm Delivered-To: mailing list internals@lists.php.net Received: (qmail 96957 invoked from network); 21 Sep 2016 22:47:42 -0000 Received: from unknown (HELO lists.php.net) (127.0.0.1) by localhost with SMTP; 21 Sep 2016 22:47:42 -0000 Authentication-Results: pb1.pair.com smtp.mail=smalyshev@gmail.com; spf=pass; sender-id=pass Authentication-Results: pb1.pair.com header.from=smalyshev@gmail.com; sender-id=pass Received-SPF: pass (pb1.pair.com: domain gmail.com designates 209.85.218.43 as permitted sender) X-PHP-List-Original-Sender: smalyshev@gmail.com X-Host-Fingerprint: 209.85.218.43 mail-oi0-f43.google.com Received: from [209.85.218.43] ([209.85.218.43:35224] helo=mail-oi0-f43.google.com) by pb1.pair.com (ecelerity 2.1.1.9-wez r(12769M)) with ESMTP id 80/BF-04117-D0E03E75 for ; Wed, 21 Sep 2016 18:47:42 -0400 Received: by mail-oi0-f43.google.com with SMTP id w11so78022788oia.2 for ; Wed, 21 Sep 2016 15:47:41 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=subject:to:references:cc:from:message-id:date:user-agent :mime-version:in-reply-to:content-transfer-encoding; bh=zjihZWvQxq7hGenPQz45e5loJcIgsdKunl3YadVRDko=; b=nbUOLI4DqOXPcdj54zsUJ8Cpo1EYu3Sw+VURh/oUCdsfyh0lNUKC257Cc9Zol4MoXq 46lAxLst4/iuQ/1EC3HnM6K8rv6TqyZp+5p7lXY0gPxZJh0b8d7ev++hgZAXI2ERFWCG UAbrITWP+Ss79ARJr8D+iqKzfbhatYCx2hQ5qnUCgQVsImz8NhYn0S6uycxPcB/2Pa7p waSx4ItJMGanZ5ft5uOq3Wx6t4OLXWGgTBilaYthTvEbWQtW1MdlWNVS+FIOGSgRy7x/ bydy9Jw5cSCdIbJbzWX/8LXNLeSlPCPGcFieNucl4qBiFjPCSiqrihJYMgXWdcjltpl8 sz8w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:subject:to:references:cc:from:message-id:date :user-agent:mime-version:in-reply-to:content-transfer-encoding; bh=zjihZWvQxq7hGenPQz45e5loJcIgsdKunl3YadVRDko=; b=HETVPXhYoUgESOrhnl64DwguOltB0O+KOw81fS0qvaE+tsZ2dHYpmSGrwphgPhS0oI NfstoTkWoPn4NSG5mN6gbp2kEedzbJaE5JeSczXmS6QlMXXcCWBeCpw1T1XkwRy5MQwH +Tfj4YHXrYqQxsfgk7t2KAFHcxiwTNZZJC5vXcf5Wyqwq8dwVacyIi+Kw3anfZAY2YjZ Ze3QCHfLtn8iUiCCaj+hyM+a+NGpNg/W2H4AGCE7D43SzCDe54zcaOO/crBowe3Qc+Qi Dy71O3mI9GhPT2xXqlgoQ9cyFVdxcRDZKaL5AJA3mWMn1TfDcj5vUxaKAeq3oYa4Bdnc lsKw== X-Gm-Message-State: AE9vXwP2i+VQ5o7mNvx7R70Wtq1BZ2bq5z8bpziB4cbbpip1JrQji08KtAQcU4W40J2tgw== X-Received: by 10.202.69.11 with SMTP id s11mr51814292oia.55.1474498059409; Wed, 21 Sep 2016 15:47:39 -0700 (PDT) Received: from [192.168.2.102] (108-233-206-104.lightspeed.sntcca.sbcglobal.net. [108.233.206.104]) by smtp.gmail.com with ESMTPSA id c6sm8787708otb.16.2016.09.21.15.47.38 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 21 Sep 2016 15:47:38 -0700 (PDT) To: Yasuo Ohgaki References: <7d5727ba-da33-e3c5-1d1f-318c45d81616@cubiclesoft.com> <9522ebc9-8d8b-045e-b701-02f1166063e6@gmail.com> Cc: Scott Arciszewski , Thomas Hruska , PHP Internals Message-ID: <7642ae28-a347-653a-026a-2fb0fa613f85@gmail.com> Date: Wed, 21 Sep 2016 15:47:37 -0700 User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.11; rv:45.0) Gecko/20100101 Thunderbird/45.2.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit Subject: Re: [PHP-DEV] HashDoS From: smalyshev@gmail.com (Stanislav Malyshev) Hi! > Hi Stas, > > On Wed, Sep 21, 2016 at 11:26 AM, Stanislav Malyshev > wrote: >> >>> I think we are better to limit max collisions. >>> I'm +1 for Nikita's proposal does this. >> >> Max collision per what? How much would be the limit? > > Collision by keys. Not sure I understand. What would be counted - number of collision per key? Per hashtable? Per process? Per request? > It would be nice to have configurable limit like regex stack/backtrack limit. > That said, wouldn't 1000 enough for almost all apps? Certainly not. Not even nearly enough. Collisions are pretty frequent with short strings, for example, and for a big long-running application 1000 hash collisions is nothing. I think you severely underestimate how frequent hash collisions are, with simple function like we're using, over millions and millions of hash accesses that we're doing routinely. I did a quick check, and just running run-tests.php -h (without any tests!) produces about 5K collisions. Running composer (without doing anything) - 8K collisions. Running composer update on a simple project - 400K (!) collisions. Now these are pretty simple cases compared to what a complex modern PHP application does. So I think you are underestimating it by about 4-5 orders of magnitude. > > Anyway, we have two choices > - Simply limit the number of collisions. (Fast and has no impact to code) Fast: true, no impact: not so much. If the limit is too low for your app (and you probably have no idea how many collisions your app has, and it is probably highly varied too) your app breaks in some very creative ways, including dropping dead in the middle of transactions, being unable to handle errors, etc. > - Use crypt safe hash and salt. (Slow and has impact to opcache/etc) These not the only two choices. > Limiting something is good to have sometimes. > Python even limits number of recursions to 1000 by default. Recursion limits are completely different. With recursion, with any proper algorithm, the deeper you go the probability of having next recursion step drops dramatically (usually exponentially - that's how you build performant algorithms). However, hash collisions are completely independent, so having one does not change the chance you have another (in fact, it may make it higher if the same data is processed again). Which means, for recursion, it is very probable there is a limit that most of your code will never reach. Even then we are cautious of it as there's no good number for a natural limit. But with hash collisions, I just see no way to tell "10K collisions is enough for anything" and at least my quick checks - unless I massively messed up the test - shows that it's not easy to find a practical limit, at least with current hash function. > We have PCRE stack/backtrack limits. (We'll have mbregex stack limit soon) > Collision limit is good one also. Does not follow, as per above. -- Stas Malyshev smalyshev@gmail.com