Newsgroups: php.internals
List-Id: internals.lists.php.net
Date: Mon, 11 Nov 2024 13:11:34 -0600
From: larry@garfieldtech.com ("Larry Garfield")
To: "php internals"
In-Reply-To: <3963499a-c9ac-4cbe-b40d-44d62a9240d2@jnvsor.net>
References: <55320aad-758a-4d06-b1bd-3eac2b5a5f71@app.fastmail.com> <3963499a-c9ac-4cbe-b40d-44d62a9240d2@jnvsor.net>
Subject: Re: [PHP-DEV] [RFC] PHP.net analytics

On Tue, Nov 5, 2024, at 3:46 PM, Jonathan Vollebregt wrote:

> For the first there's a user agent (Again, matomo-php-tracker) as well
> as media queries for transparent tracking with or CSS

The browser user-agent is widely recognized as basically useless in the vast majority of cases. Most browsers load so much crap in there to try and emulate each other that it rarely tells you anything useful, in addition to being trivially spoofable.

> Transparency is a big deal. Server side analytics are ok because PHP
> devs know what goes into an HTTP request. (And it's fairly limited in
> scope by definition) We don't know what goes into a request sent from a
> black box blob of minified JS.

Matomo was selected precisely for this reason. It's GPLv3 licensed. 100% of the code is available to review and audit. Here's the unminified JS code:

https://github.com/matomo-org/matomo/blob/5.x-dev/js/piwik.js

It cannot get more transparent than that. Using the server-side library would be no more transparent.
Using log ingestion would be no more transparent. Potentially it would be less.

> If your JS just consisted of `if(wasm) fetch()` I would be fine with
> that, but it's actually a 66kb minified JS file.
>
> Perhaps you could just start with server side tracking and see how it
> goes? I'd be much happier with client side tracking in future if it's
> voted on one metric at a time rather than a big opaque file.

Just to nip this part in the bud: requiring an RFC for every config change on the servers is a doomed idea that should never even be considered. Infra-RFCs are very rare, and they should be. Infra should by and large be handled by dedicated people, not by direct democracy. E.g., the move to GitHub issues was an RFC (https://wiki.php.net/rfc/github_issues), but tweaks to, say, issue templates or permissions or other configuration have not gone through an RFC, nor should they.

We have looked into Matomo's server library. It's potentially useful, but it doesn't give the same data that a client-side tracker would. The two would give overlapping but distinct information, so it's potentially useful to have both. That said, the server library would also require integrating into the server-side PHP code for the website, and triggering IO (database calls at least) in the web process. That can only slow down the page loading process. That would in turn mean we should really make better use of HTTP caching (which we currently do not use at all for HTML pages), which would in turn make server-side metrics even less reliable, since requests answered from a cache never reach the PHP code. (I'm of the mind that we should be aggressively caching pages anyway, especially as pages are virtually static in practice, but that's a separate matter.)

Really, the order of ease for the various collection mechanisms is:

1. Client-side JS
2. Server component
3. Log ingestion

So saying "start with the hard one, then maybe do the easy one" is frankly backwards, and just creates more work and problems.

To those who seem uncomfortable about using a JS-based metrics tool, I need to ask... why?
I have yet to see anyone put forward a practical reason why JS-based metrics are bad.

Metrics that go into a black box with a third party are bad. That's not what is being proposed.

Metrics that collect PII are bad. That's not what is being proposed.

Metrics that collect unnecessary telemetry for advertising, etc. are bad. That's not what is being proposed.

Closed source/non-free code is bad. That's not what is being proposed.

I also run an ad blocker myself, and I share the general concern about the enshittification of the Internet through advertising stalkers. That's not what is being proposed. The tools being proposed are precisely to avoid that. (It would be easier still to just toss Google Analytics on the site and be done with it, but we're very deliberately not doing that.)

So what practical, non-knee-jerk reason is there why the easiest-to-implement, easiest-to-onboard-people, least-unreliable-data option is not the best solution? Serious question, because I cannot think of one.

--Larry Garfield