Newsgroups: php.internals
Xref: news.php.net php.internals:116260
Date: Mon, 11 Oct 2021 12:33:46 +0200
From: divinity76@gmail.com (Hans Henrik Bergan)
To: Nicolas Grekas, PHP internals
Subject: Re: [PHP-DEV] [RFC] Locale-independent case conversion
References: <88b5171e-48b3-0176-47de-ee1499832b57@wikimedia.org>

@Nicolas I hope MediaWiki doesn't run on Windows, because that escapeshellarg() replacement you did is not valid on Windows.
The code prints:

echo 'foo && whoami && echo '

and when I run that in a Windows command prompt (cmd.exe) I get:

C:\Users\hansh>echo 'foo && whoami && echo '
'foo
laptop-1plmku02\hansh
'

whoami was executed even though it ran through MediaWiki's escapeshellarg() replacement.

PS: creating a proper escapeshellarg() for Windows is actually difficult, see
https://docs.microsoft.com/en-gb/archive/blogs/twistylittlepassagesallalike/everyone-quotes-command-line-arguments-the-wrong-way
(and even PHP's upstream escapeshellarg() doesn't get it quite right on Windows, corrupting more data than it strictly needs to, but I don't recall the specifics).

On Mon, 11 Oct 2021 at 09:38, Nicolas Grekas wrote:

> On Mon, 11 Oct 2021 at 03:33, Tim Starling wrote:
>
> > On 4/10/21 9:08 pm, Nikita Popov wrote:
> > >
> > > Hi Tim,
> > >
> > > Thanks for creating this proposal, it looks great!
> > >
> > > I think this is a very beneficial change, and the amount of
> > > incorrect locale-dependent calls we had just in php-src further
> > > convinced me of this: We're generally aware of the problem, and we
> > > still made this mistake. Many times.
> > >
> > > The only open question I have is regarding the ctype_* functions.
> > > One might argue that these functions should be locale-independent as
> > > well. Certainly, whenever I have used ctype_digit() I only intended
> > > it to match [0-9]. It seems like some people try to use
> > > ctype_alpha() in a locale-sensitive way
> > > (https://stackoverflow.com/questions/19929965/php-setlocale-not-working-for-ctype-alpha-check)
> > > and then fail because it doesn't support UTF-8.
> >
> > OK, I removed ctype_tolower() and ctype_toupper() from the RFC and the
> > PR since they would be incompatible with a move towards a
> > locale-independent ctype extension.
> >
> > The non-controversial parts of the PR were split and merged, so I
> > rebased the PR and updated the RFC accordingly.
> >
> > Do you think the RFC is ready for voting now?
> >
> > > PS: Regarding escapeshellarg(), are you aware of the array command
> > > support for proc_open() that was added in PHP 7.4? That does away
> > > with the need to escape arguments.
> >
> > It doesn't really help us. I recently wrote a new shell command
> > execution system for MediaWiki called Shellbox. As part of that
> > project, I reviewed how shell execution is used in the MediaWiki
> > ecosystem. There are a lot of callers which are using shell features,
> > for example redirecting inputs or outputs, or constructing pipelines.
> > I didn't really want to break them all or reimplement those features
> > without the shell. And we have security and containerization wrappers
> > which depend on construction of a shell command string. So we need to
> > be able to construct shell command strings safely.
> >
> > After studying locale sensitivity for this RFC, I decided to get rid
> > of escapeshellarg() from MediaWiki. Instead we are doing our own shell
> > escaping:
> >
> > https://gerrit.wikimedia.org/r/c/mediawiki/libs/Shellbox/+/722548
> >
> > I also made MediaWiki use a fixed locale, instead of being configurable.
>
> Hi Tim,
>
> thanks for the RFC and for the above pointers, I'm going to have a look at
> Symfony Process to follow your lead!
>
> About the RFC, I just have one note:
>
> > I didn't include strnatcasecmp() and natcasesort() in this RFC, because
> > they also use isdigit() and isspace(), and because they are intended for
> > natural language processing. They could be migrated in future.
>
> Despite their name, I never used *natcase* functions for natural language
> processing. I use them e.g. to sort lists of files in a directory, to account
> for numbers mainly. But that's not what I would call natural language
> processing.
> I'm not aware of anyone using them for that actually. I'm
> wondering if it's a good idea to postpone migrating them to a hypothetical
> future as to me, the whole reasoning of the RFC applies to them.
>
> Regards,
> Nicolas
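
To make the Windows quoting problem from the top of this message concrete, here is a minimal sketch in PHP. It is not the actual Shellbox code: the helper name posix_single_quote() is hypothetical, and the only assumption is a POSIX-style escaper that wraps the argument in single quotes. The payload is the one from the transcript above.

```php
<?php
// Hypothetical helper for illustration only, NOT the Shellbox implementation.
// POSIX-style quoting: wrap the argument in single quotes and rewrite every
// embedded single quote as '\'' (close quote, escaped quote, reopen quote).
function posix_single_quote(string $arg): string
{
    return "'" . str_replace("'", "'\\''", $arg) . "'";
}

$payload = 'foo && whoami && echo ';
$command = 'echo ' . posix_single_quote($payload);

// Prints: echo 'foo && whoami && echo '
echo $command, PHP_EOL;

// A POSIX shell treats the quoted text as one literal argument.
// cmd.exe does not treat single quotes as quoting characters, so it splits
// the line at each && and runs whoami as a separate command.
```

The string is built correctly for a POSIX shell; the problem only appears when the same string is handed to cmd.exe, which is why a Windows-safe escaper needs a different quoting scheme altogether.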