Newsgroups: php.internals
Xref: news.php.net php.internals:114558
References: <194DA850-19F7-41C4-97CF-944F13B57AFD@cmpct.info>
In-Reply-To: <194DA850-19F7-41C4-97CF-944F13B57AFD@cmpct.info>
Reply-To: Levi Morrison
Date: Fri, 21 May 2021 13:48:01 -0600
To: Calvin Buckley
Cc: PHP internals
Subject: Re: [PHP-DEV] Using clang-analyzer with PHP: experiences?
From: internals@lists.php.net ("Levi Morrison via internals")

On Fri, May 21, 2021 at 1:01 PM Calvin Buckley wrote:
>
> Hi internals@,
>
> I maintain an extension and I suspect there are some issues in the code. As such, I’ve been trying various tools to make it easier to catch the issues. (For the curious: I’ve tried *San, which I feel doesn’t work very well unless you /totally control/ the entire stack, which I didn’t have the luxury of. I also tried Valgrind, but I need to revisit this to deal with possible false positives in the library.) This time, I decided to try static analysis through LLVM.
>
> Luckily, clang-analyzer is pretty simple. Just prepending “scan-build” to my make invocation. Easy, right?
> Unfortunately, I noticed that, due to an inconsistency in the codebase (a use of realloc instead of erealloc), it doesn’t seem to account for, i.e., emalloc vs. malloc. Possible leaks “went away” from the output when I converted them to the PHP memory management functions.
>
> Has anyone ever used clang-analyzer with PHP before? I noticed there was some tooling for a previous PHP transition [1], but I don’t know if anyone’s tackled the low-hanging fruit of memory functions. I suppose I could just redefine emalloc and friends, but I feel that would probably be inaccurate with things like zend_string.
>
> Regards,
> Calvin
>
> [1]: https://github.com/johannes/clang-php-checker
> --
> PHP Internals - PHP Runtime Development Mailing List
> To unsubscribe, visit: https://www.php.net/unsub.php
>

Just to check: are you setting the environment variable USE_ZEND_ALLOC
to 0? This causes the engine to use malloc:
https://heap.space/xref/PHP-7.4/Zend/zend_alloc.c?r=600402d9#2738.

For what it's worth, I was recently annoyed _again_ by valgrind being
so noisy because zend_string_equal_val intentionally reads past the
end of a zend_string. The allocator ensures that the memory was
allocated, but it isn't guaranteed to be initialized. We should find
some way to initialize this memory for future releases -- maybe add a
function which null-terminates a zend_string by adding not one null
byte but as many as necessary to reach the end of the allocation.
This should be trivial enough in cost, compared to some other
solutions like always zeroing out the whole memory block or
initializing the trailing bytes at zend_string_alloc time.

Also, I'm not sure this read-past-the-end technique is actually safe,
such as when USE_ZEND_ALLOC is set to zero and we use malloc directly,
which does not make the same guarantees about alignment and padding on
the string...
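To make the padded null-termination idea concrete, here is a minimal sketch in plain C. It does not use the real zend_string layout or any engine API: demo_string, its capacity field, the 8-byte rounding, and every demo_* function are hypothetical names and assumptions made up for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for zend_string; NOT the real engine layout. */
typedef struct {
    size_t len;      /* payload length, excluding the terminator */
    size_t capacity; /* bytes available in val[] (assumed to be known) */
    char   val[];    /* payload, then terminator and padding */
} demo_string;

/* Assumption: the allocation is padded up to a multiple of 8 bytes. */
static size_t demo_round_up8(size_t n)
{
    return (n + 7u) & ~(size_t)7u;
}

/* Allocate room for len payload bytes plus terminator, rounded up. */
static demo_string *demo_string_alloc(size_t len)
{
    size_t capacity = demo_round_up8(len + 1);
    demo_string *s = malloc(sizeof(demo_string) + capacity);
    if (s == NULL) {
        return NULL;
    }
    s->len = len;
    s->capacity = capacity;
    return s;
}

/* Write NUL bytes from val[len] through the end of the allocation,
 * so a word-sized read past the payload only sees initialized memory. */
static void demo_string_terminate_padded(demo_string *s)
{
    memset(s->val + s->len, 0, s->capacity - s->len);
}

/* Copy len bytes from src and apply the padded terminator. */
static demo_string *demo_string_init(const char *src, size_t len)
{
    demo_string *s = demo_string_alloc(len);
    if (s == NULL) {
        return NULL;
    }
    memcpy(s->val, src, len);
    demo_string_terminate_padded(s);
    return s;
}
```

With these assumptions, demo_string_init("hello", 5) yields an 8-byte buffer whose last three bytes are all zero, so only the tail is touched rather than the whole block.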
Nikita pushed up this change only today, but it would theoretically
help with valgrind being used at runtime when PHP was not compiled
with valgrind support:
https://github.com/php/php-src/commit/a0c44fbaf19841164c7984a6c21b364d391f3750.
I say theoretically only because I haven't tested it yet.
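On the quoted idea of redefining emalloc and friends for the analyzer, one possible shape is sketched below. It relies on the __clang_analyzer__ macro, which clang defines while the static analyzer is running. The ext_alloc, ext_realloc, ext_free, and ext_copy_bytes names are hypothetical wrappers invented for this example; in a real extension the non-analyzer branch would presumably call emalloc, erealloc, and efree, and, as the original message notes, this would still not model zend_string accurately.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#ifdef __clang_analyzer__
/* Under analysis: expose plain libc calls so the analyzer's
 * built-in malloc/free modelling applies to these allocations. */
# define ext_alloc(size)        malloc(size)
# define ext_realloc(ptr, size) realloc((ptr), (size))
# define ext_free(ptr)          free(ptr)
#else
/* Normal build: stand-ins only. A real extension would presumably
 * forward to emalloc/erealloc/efree here (assumption). */
static void *ext_alloc(size_t size)
{
    return malloc(size);
}

static void *ext_realloc(void *ptr, size_t size)
{
    return realloc(ptr, size);
}

static void ext_free(void *ptr)
{
    free(ptr);
}
#endif

/* Example caller: duplicate len bytes into a NUL-terminated buffer. */
static char *ext_copy_bytes(const char *src, size_t len)
{
    char *dst = ext_alloc(len + 1);
    if (dst == NULL) {
        return NULL;
    }
    memcpy(dst, src, len);
    dst[len] = '\0';
    return dst;
}
```

Keeping all allocations behind one set of wrappers also makes mixed use (such as realloc on an emalloc'd pointer) easier to spot, since any direct libc call stands out.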