Newsgroups: php.internals
From: nikita.ppv@gmail.com (Nikita Popov)
Date: Thu, 19 Dec 2019 14:18:00 +0100
Subject: Re: [PHP-DEV] Allowing variable strings to be defined for a string offset
To: Claude Pache
Cc: George Peter Banyard, PHP internals

On Mon, Dec 16, 2019 at 12:21 PM Claude Pache wrote:

> > On 16 Dec 2019, at 10:58, George Peter Banyard wrote:
> >
> > Greetings internals,
> >
> > I'm here to get internals' opinion about allowing a string offset to be
> > replaced with an arbitrary string.
> >
> > Since forever, strings with more than one byte have been silently
> > truncated to one byte. The empty-string case *kinda* existed but had
> > buggy behaviour (see bug 71572 [1]); as such it was turned into a
> > warning in PHP 7.1.0, and it is meant to be promoted to an Error
> > exception per the Reclassifying engine warnings RFC.
> > [2]
> >
> > An illustration of both these cases is available on 3v4l.org [3].
> >
> > I've got an implementation ready as a pull request on GitHub [4] (which
> > still needs some polishing to make CI happy and to fix the various
> > leaks).
> >
> > However, the question is whether this behaviour is desirable. Moreover,
> > the silent truncation of strings with more than one byte should be
> > changed to the same severity as the empty-string case, i.e. an Error
> > exception. This seems reasonable due to the possible loss of data.
> >
> > Whatever the decision is, a BC break is to be expected: code which
> > inadvertently tries to assign multiple bytes to an offset will now have
> > all those bytes in the string, whereas before only the first one was
> > used to replace the byte at the designated offset. On the other hand,
> > the introduction of an Error exception is obviously a BC break too, but
> > it points out some kind of logic error.
> > I would assume the impact to be minimal in both of these cases.
> >
> > Any opinions?
> >
> > Best regards
> >
> > George Peter Banyard
> >
> > [1] https://bugs.php.net/bug.php?id=71572
> > [2] https://wiki.php.net/rfc/engine_warnings
> > [3] https://3v4l.org/O0nEM
> > [4] https://github.com/php/php-src/pull/5013
>
> Being able to replace one byte with an arbitrary string does not seem very
> useful to me, unless you add the functionality of replacing an arbitrary
> substring with an arbitrary string (an equivalent of array_splice() for
> strings). Your implementation gives the following potentially surprising
> result:
>
> <?php
> $str = "Hello world";
> var_dump(strlen($str)); // 11
> $str[0] = "あ"; // replace the first byte with three bytes (assuming source code is UTF-8-encoded)
> $str[0] = "H"; // replace the first byte with one byte
> var_dump(strlen($str)); // 13
> ?>
>
> IMO, the only reasonable thing to do at this point is just emitting a
> Warning (or an Error).

I agree. As it's currently silently accepted, I'd recommend starting with a
warning.

Nikita
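
For readers following the thread: a string counterpart to array_splice() is already available in userland as substr_replace(). The sketch below is illustrative only. replace_byte_at() is a hypothetical helper, not part of PHP or of the pull request; it merely emulates the proposed "replace one byte with the whole string" semantics so that the length change in Claude's example can be reproduced on any PHP version. The byte sequence "\xE3\x81\x82" is the UTF-8 encoding of "あ", written as escapes so the result does not depend on the source file's encoding.

```php
<?php
// Hypothetical helper: emulates the *proposed* offset-assignment semantics
// by replacing exactly one byte at $offset with the whole $replacement.
function replace_byte_at(string $str, int $offset, string $replacement): string
{
    // substr_replace($string, $replace, $offset, $length) swaps $length
    // bytes starting at $offset for $replace, whatever its length.
    return substr_replace($str, $replacement, $offset, 1);
}

$str = "Hello world";                            // 11 bytes
$str = replace_byte_at($str, 0, "\xE3\x81\x82"); // 1 byte out, 3 bytes in
$str = replace_byte_at($str, 0, "H");            // only the first of those 3 bytes is replaced

var_dump(strlen($str));                // int(13)
var_dump(bin2hex(substr($str, 0, 3))); // string(6) "488182": "H" followed by two stray UTF-8 continuation bytes
```

The two leftover bytes (0x81 0x82) are what makes the result surprising: the second assignment does not undo the first. Calling substr_replace() with an explicit length makes the number of bytes being replaced visible at the call site.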