Newsgroups: php.internals
From: daniil@daniil.it (Daniil Gentili)
To: Yuancheng Jiang <0599jiangyc@gmail.com>, internals@lists.php.net
Subject: Re: [PHP-DEV] A new fuzz testing tool for PHP
Date: Fri, 15 Nov 2024 19:10:51 +0100
In-Reply-To: <79C53085-9AD8-4E6D-ADAA-38AC1660A57E@gmail.com>

Hi Yuancheng,

Awesome! I noticed the impressive number of opened issues, and given your background I guessed you were working on a new fuzzer.

I’ve been thinking that the fuzzing corpus of this new fuzzer (and of the existing oss-fuzz fuzzer SAPI, currently being run by Google) could be improved by also adding the code of the community libraries (i.e. those defined by the nightly GitHub workflow at https://github.com/php/php-src/blob/master/.github/workflows/nightly.yml#L485).

I understand this might cause issues due to the volume of additional code being permuted by the fuzzer, but even running the community tests themselves without fuzzing already uncovers multiple segfaults (some of which still reproduce on master as of today).

Some time ago, I submitted https://github.com/php/php-src/pull/12406, which does multiple things to improve the coverage of the JIT compiler:

- Add a few more popular CPU-intensive community libraries such as phpseclib, psalm, phpstan, etc. These are prime candidates for the nightly tests: they benefit a lot from JIT but also suffer from its bugs (e.g. the psalm and phpseclib PHPUnit test suites currently segfault with tracing JIT, and phpseclib even explicitly disables JIT on Windows to avoid a still-unsolved bug).

- Parallelise community tests using a new custom runner, addressing concerns from Ilya, who was worried about the CI run time added by new community tests.

- Add --repeat 2 to all tests (including community tests). This catches some nasty JIT bugs by running the same script twice without recompiling the PHP code and Zend bytecode for the second run, exposing issues caused by side effects of the first compilation.

- Improve the JIT flags of community and phpt tests to detect more JIT bugs, mainly by copying them from https://github.com/danog/jit_bugs/blob/master/php.ini
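
To make the parallelisation point concrete, here is a minimal sketch of running several test-suite commands concurrently and collecting their exit codes. It is not the runner from the pull request: the function names (`run_suite`, `run_suites`) and the `max_workers` default are assumptions made purely for illustration, and the commands you pass in would be whatever invokes each library's test suite.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_suite(name, command):
    """Run one test-suite command and return (name, exit code).

    On POSIX a negative exit code means the process was killed by a
    signal (for example -11 for SIGSEGV), which is how a crash of the
    interpreter under test would typically show up here.
    """
    completed = subprocess.run(command, capture_output=True)
    return name, completed.returncode


def run_suites(suites, max_workers=4):
    """Run every suite in `suites` ({name: argv list}) concurrently.

    Threads are sufficient because each worker just waits on a child
    process. Returns {name: exit code}.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(run_suite, n, c) for n, c in suites.items()]
        return dict(f.result() for f in futures)


# Placeholder commands standing in for real test-suite invocations.
demo = {
    "passing": [sys.executable, "-c", "pass"],
    "failing": [sys.executable, "-c", "import sys; sys.exit(3)"],
}
for suite, code in run_suites(demo).items():
    print(suite, "ok" if code == 0 else f"failed (exit code {code})")
```

The total wall-clock time is then bounded by the slowest suite rather than the sum of all suites, which is the property that matters for the CI run time concern.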


I don’t currently have the free time to clean up the pull request and fix the numerous JIT bugs it detects (at least not without a support contract), but all of these approaches can be reused if you or anyone else decides to upstream FlowFusion (though I would love at least a @danog mention in the pull request :)



Regards,
Daniil Gentili

--

Daniil Gentili - Senior software engineer

Portfolio: https://daniil.it
Telegram: https://t.me/danogentili

On 15 Nov 2024, at 14:20, Yuancheng Jiang <0599jiangyc@gmail.com> wrote:

> Hi all,
>
> I have been submitting hundreds of bugs (see https://github.com/php/php-src/issues/created_by/YuanchengJiang) during the past months, and I would first like to thank all the developers who take the time to fix these issues and make PHP better.
>
> I am thrilled to introduce a fully automated fuzz testing tool, FlowFusion, for discovering various bugs in the PHP interpreter.
>
> The core idea behind FlowFusion is to leverage dataflow as an effective representation of test cases (.phpt files) maintained by PHP developers, merging two (or more) test cases to produce fused test cases with more complex code semantics. We connect two (or more) test cases by interleaving their dataflows, i.e., bringing the code context from one test case to another. This enables interactions among existing test cases, which are mostly unit tests verifying a single functionality, so the fused test cases become interesting through their merged code semantics.
>
> FlowFusion additionally fuzzes all defined functions and class methods using the code contexts of fused test cases. Available functions, classes, and methods are pre-collected and stored in sqlite3 with necessary information like the number of parameters. FlowFusion keeps up automatically as phpt files are updated: any single new test can bring thousands of new fused tests.
>
> The search space of FlowFusion is huge, which means it can cover various corner cases. The reasons for the huge search space are three-fold: (i) random pairwise combinations of around 20,000 test cases can generate 400,000,000 (20,000 × 20,000) test cases, and we can combine even more; (ii) the interleaving is randomized: given two test cases, there can be multiple ways to connect them; and (iii) FlowFusion also mutates the test case and fuzzes the runtime environment/configuration, such as JIT.
>
> I can open-source the tool under my personal repository. I wonder whether there is any chance I could contribute it as an official PHP tool under https://github.com/php; I would be happy to maintain it for a long time.
>
> Best,
> Yuancheng
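
As a quick sanity check of the 400,000,000 figure in the quoted message, the arithmetic can be sketched as follows. Whether FlowFusion counts ordered pairs, and whether a test may be fused with itself, is not stated in the message, so the three variants below are assumptions shown side by side; `pair_counts` is a hypothetical helper name.

```python
import math


def pair_counts(n):
    """Count the ways to pick two test cases out of n under three conventions."""
    return {
        # Order matters and a test may be paired with itself: n * n.
        "ordered_with_self": n * n,
        # Order matters, but the two tests must differ: n * (n - 1).
        "ordered_distinct": n * (n - 1),
        # Order ignored, two different tests: n choose 2.
        "unordered_distinct": math.comb(n, 2),
    }


counts = pair_counts(20_000)  # roughly the number of .phpt tests cited above
for label, value in counts.items():
    print(f"{label}: {value:,}")
```

Only the first convention gives exactly 400,000,000; the other two give 399,980,000 and 199,990,000, so the order of magnitude holds either way, and it grows further once more than two tests are fused or the interleaving itself is randomized.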
