Newsgroups: php.internals
Date: Thu, 23 Jul 2015 20:24:56 +0200
Subject: [Proposal] Extend support for negative string offsets
From: francois@php.net (François Laupretre)

Hi,

As my previous message seems to have gone unnoticed, I am sending it again.

I need your thoughts on a PR (https://github.com/php/php-src/pull/1431) I am working on.
The goal is to change the way negative string offsets (and lengths, where applicable) are handled, to make the behavior consistent with the way they're handled in substr() and substr_replace(). If your feedback is positive, I'll write the corresponding RFC.

Implemented so far:

- Support negative string offsets in read mode. Example: $a = $string{-2};
- Support negative string offsets in assignment mode. Example: $string{-2} = 'z';
- strpos(): accept negative values for the '$offset' argument
- stripos(): accept negative values for the '$offset' argument
- substr_count(): accept negative values for the '$offset' and '$length' parameters (same behavior as in substr()).

The next step is to work on substr_compare(). This one brings a new issue: as there's no usable default value for the '$length' param, and the case-insensitivity flag comes after it, it is impossible to perform a case-insensitive comparison without setting an explicit length. If you want the default behavior, you must compute the length yourself (strlen($main_str) - $offset).

My first idea was to use 0 as the default value, as 0 is a rather useless length. But in order to maximize BC, I will propose to use NULL as the default value for the $length parameter in string functions. This will mean 'up to the end of the string'.

The documentation also needs to be unified, because the quality sometimes differs a lot between pages for quite similar functions (as an example, see the difference between http://php.net/manual/en/function.strspn.php and http://php.net/manual/en/function.strcspn.php).

I also have some questions:

- Should we keep returning NULL on parameter parsing failure, while the documentation only states that false is returned on failure?
- Would you support unifying error types and messages when offset and/or length are out of bounds?
- In substr_compare(), an offset before the start of the string is silently converted to 0, while the same condition generates a warning in all other functions. Which behavior do you prefer?
  Silently treating negative string positions as 0, or issuing a warning?

All of this creates BC breaks. Most of these changes are additions, so they just allow values that would have caused errors in previous versions. Others give a meaning to useless values (like a NULL length). So, these BC breaks can be considered limited, but they exist. So, what are your thoughts about including these changes in 7.1? (I'd love to include them in 7.0, but I'm afraid it's too late now.)

Regards

François
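
For illustration, the offset and length rules described above can be sketched in userland PHP. This is only a rough sketch under assumptions, not the code from the PR: the helper names (resolve_offset, char_at, strpos_from, substr_compare_ci) are hypothetical, and returning null/false for out-of-bounds offsets is one possible choice among those raised in the questions above.

```php
<?php
// Hypothetical userland helpers emulating the proposed semantics.
// None of these function names exist in PHP itself.

// Map a possibly negative offset to an absolute position. A negative
// offset counts from the end of the string, as in substr().
// Returns null when the position falls outside the string.
function resolve_offset($offset, $strlen)
{
    $pos = $offset < 0 ? $strlen + $offset : $offset;
    return ($pos < 0 || $pos > $strlen) ? null : $pos;
}

// Read one character at a possibly negative offset
// (what $string{-2} would mean under the proposal).
function char_at($string, $offset)
{
    $len = strlen($string);
    $pos = resolve_offset($offset, $len);
    return ($pos === null || $pos === $len) ? null : $string[$pos];
}

// strpos() with a negative $offset: start searching that many
// characters before the end of the haystack.
function strpos_from($haystack, $needle, $offset = 0)
{
    $pos = resolve_offset($offset, strlen($haystack));
    return $pos === null ? false : strpos($haystack, $needle, $pos);
}

// Case-insensitive substr_compare() where a NULL $length means
// "up to the end of the string" (assumption: strlen($main) - offset).
function substr_compare_ci($main, $str, $offset, $length = null)
{
    $pos = resolve_offset($offset, strlen($main));
    if ($pos === null) {
        return false;
    }
    if ($length === null) {
        $length = strlen($main) - $pos;
    }
    return substr_compare($main, $str, $pos, $length, true);
}

var_dump(char_at('abcdef', -2));                          // string(1) "e"
var_dump(strpos_from('abcabc', 'b', -3));                 // int(4)
var_dump(substr_compare_ci('Hello World', 'WORLD', -5));  // int(0)
```

With native support, the first helper would be unnecessary, and the NULL default would remove the need to compute the length by hand in the last one.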