Hello Internals,
I would like to start the discussion on the Path to Saner
Increment/Decrement operators RFC:
https://wiki.php.net/rfc/saner-inc-dec-operators
The goal of this RFC is to reduce language complexity by making $v++ behave
like $v += 1 and $v-- behave like $v -= 1;
I am expecting the contentious part of the proposal to be the deprecation
of the PERL string increment feature to achieve the aforementioned goal.
However, I believe the benefits of aligning the behaviour of the
increment/decrement operators with addition/subtraction are larger than
keeping support for the PERL increment, which in its current state has
various shortcomings.
Best regards,
George P. Banyard
Hi George,
Just to confirm, you're suggesting the following would be broken:
$ref = 'A';
$ref++;
var_dump($ref); // 'B'
$ref = 'Z';
$ref++;
var_dump($ref); // 'AA'
$ref++;
var_dump($ref); // 'AB'
I've seen this used a few times, e.g. starting with a numerical value (Passport number, NHS number, Social Security Number, Date of Birth 20230117), and the developer simply appends an incrementing letter on the end to get a unique reference; e.g. a person having multiple assessments... especially if it's more than 26 (A-Z), and you need to move to multiple letters, which chr(90 + 1) cannot help you with.
That said, I appreciate that incrementing some strings can be a bit unusual (e.g. "A9" to "B0", vs "A 9" to "A 0").
Craig
Being able to increment alpha strings is incredibly useful when working
with Excel spreadsheets (as I do on a daily basis), because the column
Ids match this pattern; and I would hate to see this deprecated. Having
to replicate that logic for traversing column Ids in userland code would
be inconvenient (to say the least), would affect many of the users of my
libraries, and would have a performance impact on my libraries. If
anything, I'd rather like to see the decrement operator work with alpha
strings as well for more consistency.
I don't have the karma for a vote; but if I did then it would be a "No"
for this alone, because I can see the problems that it will cause me and
the users of my libraries.
That said, I appreciate that incrementing some strings can be a bit unusual (e.g. "A9" to "B0", vs "A 9" to "A 0").
Agreed. While incrementing works in a very logical manner with mixed
alphanumeric strings, it's not well documented behaviour, and most
developers take a long time before they understand what it's actually
doing. While there might be use cases for incrementing alphanumerics, I
suspect that it would be better implemented in the business logic of an
application, because the component parts of that string are likely to
have business meaning; and also to provide better code readability.
--
Mark Baker
I appreciate being shown concrete cases of the usefulness of this
operation.
The reason I didn't go with adding support for decrementing alphanumeric
strings is that it was unanimously rejected.
However, if Rowan's suggestion of adding
string_increment()/string_decrement() with more rigorous behaviour (that we
can flesh out together) would be part of this proposal, would you be more
inclined to accept deprecating ++ from performing this feature?
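To make the idea of a stricter string_decrement() a bit more concrete, here is a rough userland sketch for purely uppercase strings. The function name alpha_decrement() and the semantics chosen here (bijective base-26, so "AA" steps back to "Z" and "A" has no predecessor) are assumptions for illustration only; they are not what this RFC or the rejected alphanumeric decrement RFC specify.

```php
<?php
// Hypothetical helper: step an uppercase A-Z string back by one position.
function alpha_decrement(string $s): string
{
    // Strict validation: only non-empty uppercase ASCII letters are accepted.
    if (preg_match('/^[A-Z]+$/', $s) !== 1) {
        throw new ValueError('Expected a non-empty string of A-Z');
    }
    if ($s === 'A') {
        throw new ValueError('"A" has no predecessor');
    }
    // Walk from the rightmost letter, borrowing while we see an 'A'.
    for ($i = strlen($s) - 1; $i >= 0; $i--) {
        if ($s[$i] !== 'A') {
            $s[$i] = chr(ord($s[$i]) - 1);
            return $s;
        }
        $s[$i] = 'Z';
    }
    // Every letter was 'A' (e.g. "AA"): the result is one letter shorter.
    return substr($s, 1);
}

var_dump(alpha_decrement('B'));  // string(1) "A"
var_dump(alpha_decrement('AA')); // string(1) "Z"
var_dump(alpha_decrement('BA')); // string(2) "AZ"
```

Rejecting anything outside A-Z up front is one possible reading of "more rigorous behaviour"; mixed-case or alphanumeric input would need its own rules.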
I truly believe having $v++ behave like $v += 1 and $v-- behave like $v -=
1; is something to strive for because it allows us to remove one
dedicated type juggling context people need to be aware of and simplifies
the overall semantics of the language.
Keeping support for string increments means that one cannot interchange $v++
and $v += 1 and that one needs to be careful about using it when a value
might hold a string.
As such, if it needs to remain its own type juggling context, the question
is why not make it stricter by having it warn and then throw a TypeError on
bool, reopening the can of worms that is the null handling between both
operators and what to do with the empty string case.
These questions are already answered by making those operators behave just
like addition/subtraction.
My order of preference for the semantics is as follows:
1. The behaviour described in the RFC (with or without function for string
in/decrement)
2. (with a massive gap, but I could live with it) adding support for string
decrements and tidying up the behaviour of the alphanumeric string to make
it stricter and less error-prone.
3. The current asymmetry (again with a massive gap between this and option
But because option 2 seems out of the question due to the unanimous
rejection of https://wiki.php.net/rfc/alpanumeric_decrement, the only
viable options to me seem like 1 and 3.
As I hate option 3 I am pushing for option 1 as I think it has various
benefits.
Moreover, I do not want to split this into its own proposal (deprecating
string increments/figuring out what to do with them) as I feel it will make
any attempt to improve the situation harder.
Best regards,
George P. Banyard
On 2023-01-18 at 13:22, G. P. B. wrote:
Since the alpanumeric_decrement RFC was rejected in January 2014, nine years ago,
could it be an option to bring it up again in conjunction with your RFC?
Maybe the added value of your RFC could swing the opinion. I mean, there have
been RFCs that required multiple tries to fly.
Regards //Björn L
On Wed, 18 Jan 2023 at 14:35, Björn Larsson bjorn.x.larsson@telia.com
wrote:
Possibly, and I could wait for the result of such an RFC, but I do not
intend on pushing this forward.
The goal of this RFC is to reduce language complexity by making $v++
behave like $v += 1 and $v-- behave like $v -= 1;
If that is the goal, then I would agree with this RFC.
However, changing the PERL string increment feature does not IMO fit
into that goal, and it is also a useful feature. On that basis I would
vote against this. And I suspect many others would as well.
I do not understand how this does not fit into that goal.
$s = "a10";
$s += 1;
var_dump($s);
Results in a TypeError whereas
$s = "a10";
$s++;
var_dump($s);
Results in string(3) "a11"
Therefore, $s++ does not behave like $s += 1, and thus this is in scope.
Is there a way to avoid this single useful feature from being
deprecated, while the good parts of this RFC stay?
Yes, but at that point, I don't see why we should unify the behaviour if it
is going to remain inconsistent. Might as well make incrementing on bool,
and decrementing null a TypeError in PHP 9 to make it stricter.
I am also unsure as to how much actual breakage this would cause, and
before this gets up to a vote, I would like to see how badly (or not) this
would affect already existing code bases.
Fair point, I can try and run Nikita's script on the top composer packages,
but that won't show the state of private codebases.
On Wed, 18 Jan 2023 at 16:03, Levi Morrison levi.morrison@datadoghq.com
wrote:
It seems to me that if you truly want to clean up this specific part
of the language, you are going to have to play the long game:
- New functions are added for the perl behavior of string increment
and decrement. No warnings are given in code, but call-outs are made
in upgrading and other documentation about this behavior changing.
Note that in the past I would have used an E_STRICT for this, but
people seem opposed to adding new E_STRICT warnings.
- In the next minor version, we add a warning about the behavior
when string increment/decrement is used.
- In the next major version, we finally clean up the behavior.
But this gets muddy if we do PHP 8.3 for step 1, and then we decide to
go for PHP 9.0 instead of 8.4, and it messes with the "ideal" cycle.
Note that I support this sort of plan, and would support it for
cleaning up many other parts of PHP as well. It's just unfortunate it
takes so long, but that's how it goes sometimes :/
I don't think we need such a long timeline because the function is easily
polyfilled.
Moreover, if people jump a version in an upgrade, they are still going to
immediately receive a warning/deprecation.
But if such a timeline is preferred, I do not mind changing it.
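As a rough idea of what such a polyfill could look like, here is a sketch under stated assumptions. It uses the str_increment() name that appears elsewhere in this thread, only defines it when no function of that name exists, and implements one possible strict behaviour (ASCII alphanumeric input only, carry within the same character class). The helper name userland_str_increment() is hypothetical, and the exact semantics of the final native function may differ.

```php
<?php
// Hypothetical userland implementation of a strict alphanumeric increment.
function userland_str_increment(string $s): string
{
    if (preg_match('/^[A-Za-z0-9]+$/', $s) !== 1) {
        throw new ValueError('Expected a non-empty alphanumeric ASCII string');
    }
    // Walk from the rightmost character, carrying while a character wraps.
    for ($i = strlen($s) - 1; $i >= 0; $i--) {
        $c = $s[$i];
        if ($c === 'z') {
            $s[$i] = 'a';
        } elseif ($c === 'Z') {
            $s[$i] = 'A';
        } elseif ($c === '9') {
            $s[$i] = '0';
        } else {
            // No wrap at this position, so the carry stops here.
            $s[$i] = chr(ord($c) + 1);
            return $s;
        }
    }
    // Every position wrapped: grow the string by one leading character.
    $first = $s[0]; // now 'a', 'A' or '0'
    return ($first === '0' ? '1' : $first) . $s;
}

// Polyfill guard: only declare the global name if nothing provides it yet.
if (!function_exists('str_increment')) {
    function str_increment(string $s): string
    {
        return userland_str_increment($s);
    }
}

var_dump(userland_str_increment('Z'));   // string(2) "AA"
var_dump(userland_str_increment('a9'));  // string(2) "b0"
var_dump(userland_str_increment('5e9')); // string(3) "5f0"
```

Because the helper never uses ++ on the string itself, an input such as "5e9" is never reinterpreted as a float.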
Classes and methods are the expected way of implementing a standard library in
an OO language. New APIs (such as the new Random API) use OO instead of
functions and it makes more sense to use OO in this case too: there's
likely a place for other methods too, like toBase(int $otherBase) etc. It
would also be possible to use overloaded operators if needed.
Until we have strings that can invoke methods, I don't see the point of
having an OO API.
PHP is a multi paradigm language, and creating a class with two methods
seems very useless to me.
OOP is favoured in PHP because using functions is just an overall terrible
experience that needs improvements, but using functional patterns is
totally doable (and can produce elegant code) in PHP.
George P. Banyard
I agree that adding a str_increment() function and then (later) deprecating and removing the ++ functionality is the way to go. I defer to George if it makes sense to fix the other parts of ++ now and wait for 9 for string-incrementing, or just wait and get rid of them all at once.
While str_increment() is less ergonomic than ++, I think the improved type consistency is worth it. And it's trivial to wrap up into a generator if one wants something shorter:
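function cols(string $s): \Generator
{
    while (true) {
        $s = str_increment($s);
        yield $s;
    }
}
$it = cols('A');
$it->current(); // 'B' (runs the generator up to its first yield)
$it->next();    // advances the generator; next() itself returns nothing
$it->current(); // 'C'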
And that can be made parallel with something on the number side if you really want.
--Larry Garfield
However, the ++ and -- are the "Increment" and "Decrement" operators,
not the Add1 and Subtract1 operators; while they behave in that way when
used with variables containing numeric values, they are special
operators and not simply a syntactic sugar for +=1 and -=1. As long as
their behaviour is consistent, and definition of what "Increment" and
"Decrement" mean is clearly defined for different datatypes, then I feel
that the PERL-style alpha string increment has enough valid use cases to
justify itself.
We might also discuss consistency of datatype changes when these
operators are used.
$a = PHP_INT_MAX;
++$a;
or
$a = '10';
++$a;
both change the datatype of $a; which isn't documented behaviour either.
--
Mark Baker
I understand your point, but also understand why increment/decrement
operations can be seen as adding/subtracting 1 (the current
documentation is actually pretty close to that interpretation). I'm not
sure which interpretation I'd prefer.
We might also discuss consistency of datatype changes when these
operators are used.
$a = PHP_INT_MAX;
++$a;
or
$a = '10';
++$a;
both change the datatype of $a; which isn't documented behaviour either.
Well, that is the usual type juggling, but it is indeed not properly
documented, from what I can tell (the "type juggling" page doesn't mention
increment/decrement, and the "incrementing/decrementing operators" page
doesn't mention the type juggling).
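The two cases quoted above can be checked directly. A small sketch, assuming PHP 8 on a 64-bit build; the variable names are only illustrative, and the third case ('1.5') is an extra example that is not from the original mail.

```php
<?php
$a = PHP_INT_MAX;
++$a; // integer overflow: the value is promoted to float

$b = '10';
++$b; // integer-like numeric string: the result is an int

$c = '1.5';
++$c; // float-like numeric string: the result is a float

var_dump(is_float($a)); // bool(true)
var_dump($b);           // int(11)
var_dump($c);           // float(2.5)
```

In all three cases the type after the increment is the same one that $x += 1 would produce, which is the sense in which this is the usual type juggling.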
--
Christoph M. Becker
That's a strange hill to die on, most people would expect that those
operators do indeed behave like Add1 and Sub1, and clearly you are not
having any issue with making -- act like Sub1.
There is even a user note on the manual page from 21 years ago where someone was
expecting booleans to be incremented. [1]
Moreover, the alphanumeric string increment feature is fundamentally broken
and unsound due to the simple fact that PHP supports converting numeric
strings written in scientific notation to float.
And this behaviour of casting a numeric string in scientific notation
always takes precedence over the alphanumeric increment.
The following code samples show this perfectly:
$s = "5d9";
var_dump(++$s); // string(3) "5e0"
var_dump(++$s); // float(6)
$s = "5e9";
var_dump(++$s); // float(5000000001)
var_dump(++$s); // float(5000000002)
Behaviour that also has a user note on the manual page. [2]
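One way to see which strings are at risk is to ask PHP whether it considers them numeric. The sketch below uses is_numeric() as a proxy for the check the operators perform; treating it as an exact mirror of the engine's internal logic is an assumption, and the sample strings are just the ones mentioned in this thread.

```php
<?php
$samples = ['5d9', '5e0', '5e9', 'a10', '0xf9'];

$numeric = [];
foreach ($samples as $s) {
    // Numeric strings are converted to int/float before incrementing;
    // the others are candidates for the Perl-style string increment.
    $numeric[$s] = is_numeric($s);
    printf("%-5s %s\n", $s, $numeric[$s] ? 'numeric' : 'not numeric');
}
```

Only "5e0" and "5e9" are reported as numeric here, which matches the point where the examples above switch from string to float results.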
It is possible to have a sound implementation of it in userland, via the
following function:
function polyfill(string $s): string {
    if (is_numeric($s)) {
        $offset = stripos($s, 'e');
        if ($offset !== false) {
            /* Using increment operator would cast the string to float
             * Therefore we manually increment it to convert it to an
             * "f"/"F" that doesn't get affected */
            $c = $s[$offset];
            $c++;
            $s[$offset] = $c;
            $s++;
            $s[$offset] = match ($s[$offset]) {
                'f' => 'e',
                'F' => 'E',
                'g' => 'f',
                'G' => 'F',
            };
            return $s;
        }
    }
    return ++$s;
}
We might also discuss consistency of datatype changes when these
operators are used.
$a = PHP_INT_MAX;
++$a;
or
$a = '10';
++$a;
both change the datatype of $a; which isn't documented behaviour either.
Those are just "regular" well documented type coercions due to addition and
int to float promotions.
And yes, the PHP documentation could be better, but no one is paid to work
on it.
However, it turns out that I only work part-time for The PHP Foundation,
and therefore I am available to be hired to do some technical writing for
the PHP Documentation.
(If anyone does want to take me up on this offer, do feel free to email me).
In any case, I've updated the RFC to version 0.3, [3] with a whole section
about the current behaviour of the PERL increment implementation in PHP, a
native implementation of str_increment() and str_decrement() that handles
strings that can be interpreted as numbers written in scientific notation.
It also changes the timeline slightly, as this seems the preferred course
of action.
Best regards,
George P. Banyard
[1] https://www.php.net/manual/en/language.operators.increment.php#16149
[2] https://www.php.net/manual/en/language.operators.increment.php#109621
[3] https://wiki.php.net/rfc/saner-inc-dec-operators
That's a strange hill to die on, most people would expect that those
operators do indeed behave like Add1 and Sub1, and clearly you are not
having any issue with making -- act like Sub1.
It isn't so strange, when you consider that an open-source library like
PhpSpreadsheet, with over 100 million installs, makes extensive use of
alpha and numeric increment (not alphanumeric); and I dearly wish that
PHP had implemented the decrement operator for alpha strings as well,
because that would have allowed me to simplify the codebase even
further; so I would prefer if -- acted like a decrement for alpha
strings, and not simply as a Sub1 for numerics.
The documentation page consistently uses the word Increment and
Decrement, not Add 1 and Subtract 1.
Developers who read the documentation should be aware of the Perl
convention when dealing with alphabetic strings, and should expect that
behaviour. Alphanumeric strings are certainly more problematic, less
well documented, and less well understood, and I'll agree that they're
inconsistent in their behaviour.
Deprecating the Increment operator for strings will create extra work
for me, will affect many of the users of my library, and I'm certain it
will also have a performance impact on the library (replacing that
operation with a more expensive function call for alpha increments, but
still having the operation for numeric increments). So yes, I am willing
to die on this hill because that deprecation will have a very direct and
adverse effect on my work.
--
Mark Baker
I don't think it's such a huge issue as you make it to be. The
documentation states this only as an alternative:
https://phpspreadsheet.readthedocs.io/en/latest/topics/accessing-cells/#looping-through-cells-using-indexes
It also mentions the pitfalls. I doubt many users would prefer that
"alternative" given that the recommended way is simpler and does the same
thing.
I also don't think that performance would come into play here. Any
difference would be insignificant.
I believe that the benefits of the deprecation outweigh any potential extra
work the developers may have.
I beg to differ. Having done a lot of work last year recommending the
deprecation of column indexes (1, 2, 3, etc.) in favour of
addresses ('A', 'B', 'C', etc.) to more closely match Excel's
layout; and knowing how much the codebase uses alpha column incrementing
internally, I'm aware of how much of an issue it is.
I'd prefer it if developers using PhpSpreadsheet used the built-in
iterators, and I'm in the middle of writing a post on the benefits of
doing just that; but "under the hood" alpha column names are used, not
numeric indexes; so a lot of the "under the hood" code uses ++ and --
with row numbers and column addresses.
And replacing any PHP operator with a function call is always going to
be less performant.
The benefits of this deprecation seem to be making ++ and -- consistent
by making them little more than a syntactic sugar for numeric values
only; I'm afraid I don't see that as a benefit; but as a retrograde
step, and not the same as making ++ and -- consistent.
--
Mark Baker
The documentation page consistently uses the word Increment and
Decrement, not Add 1 and Subtract 1.
Developers who read the documentation should be aware of the Perl
convention when dealing with alphabetic strings, and should expect that
behaviour. Alphanumeric strings are certainly more problematic, less
well documented, and less well understood, and I'll agree that they're
inconsistent in their behaviour.
The PHP documentation has never been the source of truth about the PHP
implementation;
if it were, dynamic properties should have been removed without any notice,
as there was no documentation for them until they were deprecated.
So arguing something should behave a certain way because the docs are
written in a certain way holds no value to me.
The state of the PHP docs could certainly be improved, as it is blatantly
lying at various core sections.
However, the whole point of this RFC is to remove cognitive burden for
developers, so they don't even need to be aware of this "feature" and not
get surprised when it kicks in.
Moreover, by your logic, you wouldn't care if we removed support for
alphanumeric strings and only let the PERL increment kick in for purely
alphabetical.
While convenient for you, someone might actually use this feature on
alphanumeric strings, and we're back to "why is my use case being removed
while that other just as weird one remains".
I even went initially for such a proposal (see previous revision [1]),
however, I realized this provides minimal benefit as it doesn't reduce at
all the cognitive burden or the overall design specification of the
language.
Deprecating the Increment operator for strings will create extra work
for me, will affect many of the users of my library, and I'm certain it
will also have a performance impact on the library (replacing that
operation with a more expensive function call for alpha increments, but
still having the operation for numeric increments). So yes, I am willing
to die on this hill because that deprecation will have a very direct and
adverse effect on my work.
If the issue is about performance, it is possible to enhance the optimizer
to inline those two functions (something the optimizer already attempts to
do for userland functions [2]).
The other aspect goes back to the typical conundrum: Do we think PHP is in
decline?
- If yes, then not doing breaking changes and catering to legacy projects
is definitely the course of action to take.
- If no, improving PHP for the next generation of developers and software,
so that it is easier to learn and reason about, should be the course of
action to take.
As I personally think we are in the second situation, that's why I'm
tackling this subject in this manner.
It's the same belief that made us deprecate dynamic properties, convert
warnings to Errors, etc.
Now, if we are in the former case, then you are totally justified, and I
probably should find another job as I see no point in not improving the
overall language semantics.
George P. Banyard
[1] https://wiki.php.net/rfc/saner-inc-dec-operators?rev=1669977388
[2] See zend_try_inline_call() function
However, the whole point of this RFC is to remove cognitive burden for
developers, so they don't even need to be aware of this "feature" and not
get surprised when it kicks in.
Moreover, by your logic, you wouldn't care if we removed support for
alphanumeric strings and only let the PERL increment kick in for purely
alphabetical.
While convenient for you, someone might actually use this feature on
alphanumeric strings, and we're back to "why is my use case being removed
while that other just as weird one remains".
I make no judgement on alphanumeric strings, other than I can't see any
use case for it myself, so I won't allow my objection be considered
hypocritical; and your definition of my use case as "weird" is highly
judgemental.
Bijective numeration using the letters of the alphabet has a long and
ancient tradition, pre-dating our modern numeric Hindu-Arabic system
using base 10 for place/value notation by many centuries. The Abjadi
system used the 28 letters of the Arabic alphabet; similarly the ancient
Greeks and Hebrews, the Armenians; by Russia until the early 18th
Century (each culture using their own alphabet). It's ironic that the
Romans used a very different system, even though our modern western
alphabet is based on the Roman alphabet.
These civilisations didn't consider their alphabetic numeral system "weird".
How many of the irregularities and idiosyncrasies of alphanumeric
strings could be resolved by not trying to cast them as a numeric value
before increment/decrement; but by treating them consistently as
strings? It would resolve the discrepancy with "5d9"; although not with
"0xf9".
--
Mark Baker
Hey All
The thing that I consider "weird" and that I would really love to see
addressed in a future version of PHP is that the increment of strings
only works with a-z|A-Z while there are a lot of other alphabets where
that should work similarly. See https://3v4l.org/k0Nti for such an example.
Incrementing a string is something that people should only do when
they know what they are doing. That numeric strings are incremented one
way and other strings another one is indeed something that can irritate
people. And due to the missing type-system in former times it made
sense. But given the by now available type-system using a "sane"
increment could be as easy as ++(int)$var.
That way it is clear that whatever is in the variable $var should be
treated as integer (or float) and be incremented after the conversion.
That way the already existing feature - it is clearly specified in the
documentation and seems to be specified there for the last 20+ years -
would stay the same while still allowing people to easily make sure that
the increment way they expect is used.
Yes! That would mean that $i = "9"; echo ++$i would output A,
whereas echo ++(int)$i would output 10. But I find that very
intuitive and sane.
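A minimal sketch of the explicit-cast idea in syntax that runs on current PHP. As far as I can tell the parser does not accept ++ applied directly to a cast expression, so the cast and the increment are written as two steps here; the helper name int_increment is hypothetical.

```php
<?php
// Hypothetical helper: cast first, then increment the resulting integer.
function int_increment($value): int
{
    $n = (int) $value; // explicit conversion, so the intent is visible
    return ++$n;       // plain integer increment
}

var_dump(int_increment("9"));  // int(10)
var_dump(int_increment("41")); // int(42)
```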
Just my 0.02€
Cheers
Andreas
In any case, I've updated the RFC to version 0.3, [3] with a whole section
about the current behaviour of the PERL increment implementation in PHP, a
native implementation of str_increment() and str_decrement() that handles
strings that can be interpreted as numbers written in scientific notation.
It also changes the timeline slightly, as this seems the preferred course
of action.
This seems like a reasonable plan to me, thanks!
--Larry Garfield
[...]
I appreciate being shown concrete cases about the usefulness of this operation.
The reason I didn't go with adding support for decrementing alphanumeric
strings is that it was unanimously rejected.
However, if Rowan's suggestion of adding
string_increment()/string_decrement() with more rigorous behaviour (that we
can flesh out together) would be part of this proposal, would you be more
inclined to accept deprecating ++ from performing this feature?
Hi George,
I don't have a vote at the moment (I think I need one more RFC to pass)... but you might be able to convince me. I'd just like to know that breakages are really worth it, because my biggest issue is trying to get developers to upgrade their PHP installs (a lot are still on 7.4).
I agree that some of the incrementing behaviour can be a bit weird, and I would be happy to see those be deprecated/removed; but I worry that the A, B, ..., Z, AA, AB, etc is something that works well today, and is likely to be tricky to find/replace with a new function in all existing code.
At the moment I'd prefer Option 2 or 3, with the focus being on "tidying up the behaviour of the alphanumeric string to make it stricter and less error-prone."
Craig
Replacing the use of an existing operator with a new function call, but
only in certain circumstances (for alpha increments but not for numeric
increments) would also be a pain; we'd have to examine every instance of
++ use to see what datatype it was being used on (SA tools won't
necessarily help with that); and when looping over Excel rows and
columns it would seem strange allowing the ++ operator for rows, but
having to use a function call for columns when all previous code also
used ++.
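For illustration, the kind of loop described above might look like the sketch below, assuming the string increment behaviour discussed in this thread. The variable names are made up for the example.

```php
<?php
// Walk a few spreadsheet columns with ++ on a string, and rows with ++ on an int.
$cells = [];
$column = 'Y';
for ($i = 0; $i < 4; $i++) {
    for ($row = 1; $row <= 2; $row++) {
        $cells[] = $column . $row;
    }
    $column++; // 'Y' -> 'Z' -> 'AA' -> 'AB' -> 'AC'
}
echo implode(' ', $cells), "\n"; // Y1 Y2 Z1 Z2 AA1 AA2 AB1 AB2
```

Both loops use the same operator; only the type of the operand differs, which is what makes a mechanical search-and-replace hard.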
--
Mark Baker
I would like to start the discussion on the Path to Saner
Increment/Decrement operators RFC:
https://wiki.php.net/rfc/saner-inc-dec-operators
The goal of this RFC is to reduce language complexity by making $v++ behave
like $v += 1 and $v-- behave like $v -= 1;
Hi George,
Thanks for tackling this. I heartily agree with the aims of this RFC;
the current situation is a weird mess of special cases, most of which
are justified only by "it's always been that way".
I am expecting the contentious part of the proposal to be the deprecation
of the PERL string increment feature to achieve the aforementioned goal.
As Craig and Mark point out, this functionality does have legitimate
uses, and I am generally of the opinion that deprecations should either
be of broken functionality (as is often the case when upgrading Warnings
to Errors) or come with specific instructions for a replacement.
Perhaps therefore we should introduce a new function, string_inc, as the
official migration path for deliberate use of this feature. This could
give the same result as the current increment operator for supported
cases, but throw an Error for cases that would currently be left
unchanged. A polyfill using the existing operator on old versions would
look something like this:
function string_inc(string $input): string {
    $output = $input;
    @++$output;
    if ( $input === $output ) {
        throw new Error("Unsupported input to string_inc: '$input'");
    }
    return $output;
}
Without the need to interact with other types or existing behaviour, I
think an accompanying string_dec would also be less controversial than
previous RFCs. Ambiguous cases like string_dec("a"), string_dec(""), and
string_dec("0") could simply throw an Error.
Regards,
--
Rowan Tommins
[IMSoP]
I would like to start the discussion on the Path to Saner
Increment/Decrement operators RFC:
https://wiki.php.net/rfc/saner-inc-dec-operators
The goal of this RFC is to reduce language complexity by making $v++
behave like $v += 1 and $v-- behave like $v -= 1;
If that is the goal, then I would agree with this RFC.
However, changing the PERL string increment feature does not IMO fit
into that goal, and it is also a useful feature. On that basis I would
vote against this. And I suspect many others would as well.
Is there a way to avoid this single useful feature from being
deprecated, while the good parts of this RFC stay?
I am also unsure as to how much actual breakage this would cause, and
before this gets up to a vote, I would like to see how badly (or not) this
would affect already existing code bases.
cheers,
Derick
It seems to me that if you truly want to clean up this specific part
of the language, you are going to have to play the long game:
- New functions are added for the perl behavior of string increment
and decrement. No warnings are given in code, but call-outs are made
in upgrading and other documentation about this behavior changing.
Note that in the past I would have used an E_STRICT for this, but
people seem opposed to adding new E_STRICT warnings.
- In the next minor version, we add a warning about the behavior
when string increment/decrement is used.
- In the next major version, we finally clean up the behavior.
But this gets muddy if we do PHP 8.3 for step 1, and then we decide to
go for PHP 9.0 instead of 8.4, and it messes with the "ideal" cycle.
Note that I support this sort of plan, and would support it for
cleaning up many other parts of PHP as well. It's just unfortunate it
takes so long, but that's how it goes sometimes :/
The goal of this RFC is to reduce language complexity by making $v++
behave like $v += 1 and $v-- behave like $v -= 1;
If that is the goal, then I would agree with this RFC.
However, changing the PERL string increment feature does not IMO fit
into that goal
I agree, changing its behaviour does not achieve that goal; only removing the behaviour (with an appropriate deprecation period and upgrade path) achieves the goal.
(For completeness, the only other way to achieve the goal would be to add support for string increments to the += operator, and presumably also the + operator to avoid a different inconsistency. That is, make 'a' + 5 === 'f'. I don't think that's even worth considering, but it's the only other way to achieve consistency.)
Regards,
--
Rowan Tommins
[IMSoP]
I like this proposal and I support making the language consistent. I wasn't
aware there were so many inconsistencies with increment/decrement
operators.
When I read the RFC I was a little sceptical about the deprecation of
string increment functionality. It's something I used in the past and I see
no easy upgrade path. However, after reading this thread and thinking it
over, I realize that deprecation is the right way to go. Someone said that
it's useful when working with Excel. Excel uses bijective base-26 system.
PHP does not. I cannot even explain what logic governs PHP string increment
functionality.
$s = "az";
var_dump(++$s); // string(2) "ba"
$s = "a9";
var_dump(++$s); // string(2) "b0"
$s = "99";
var_dump(++$s); // int(100)
$s = "zZ";
var_dump(++$s); // string(3) "aaA"
$s = "9D9";
var_dump(++$s); // string(3) "9E0"
$s = "9E0";
var_dump(++$s); // float(10)
$s = "CHEAP BED";
var_dump(++$s); // string(9) "CHEAP BEE"
Strings should not be incrementable unless they are numeric strings. The
current "feature" is more like a bug from xkcd comic. https://xkcd.com/1172/
But as there is a real need for a similar functionality, for example when
working with Excel, I would propose to add a class into the language that
is able to calculate and iterate any bijective base system. It needs to
have a clear functional spec and should support both increment/decrement
operators as well as iterators. I see this as the only way out of this
mess. This RFC needs to pass, but it cannot pass without an alternative for
people who actually use this "feature".
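As a rough sketch of what "bijective base-26" means for Excel-style column labels, the two functions below convert between a positive integer and its label. The function names are hypothetical and are not an existing PHP API; only uppercase A-Z is handled.

```php
<?php
// Hypothetical helpers illustrating bijective base-26 (1 => A, 26 => Z, 27 => AA).
function to_bijective_base26(int $n): string
{
    if ($n < 1) {
        throw new InvalidArgumentException('Bijective numeration has no zero or negative values');
    }
    $label = '';
    while ($n > 0) {
        $n--;                                   // shift so this digit is in 0..25
        $label = chr(65 + $n % 26) . $label;    // 65 is ord('A')
        $n = intdiv($n, 26);
    }
    return $label;
}

function from_bijective_base26(string $label): int
{
    if (!preg_match('/^[A-Z]+$/', $label)) {
        throw new InvalidArgumentException('Expected one or more letters A-Z');
    }
    $n = 0;
    foreach (str_split($label) as $letter) {
        $n = $n * 26 + (ord($letter) - 64);     // A counts as 1, Z as 26
    }
    return $n;
}

echo to_bijective_base26(27), "\n";                               // AA
echo to_bijective_base26(from_bijective_base26('ZZ') + 1), "\n";  // AAA
```

With such a pair, "incrementing" a label is ordinary integer addition followed by a conversion, and decrementing is well defined down to A.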
The PHP manual says: "The increment/decrement operators only affect numbers
and strings. Arrays, objects, booleans and resources are not affected.
Decrementing null values has no effect too, but incrementing them results
in 1."
But that's not true. You cannot increment an array or resource as it would
trigger an error. But incrementing false/true doesn't generate any errors.
It's very inconsistent and misleading.
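A short sketch of the cases mentioned, assuming PHP 8 semantics. The @ operator is only there to hide any diagnostics that newer versions may emit for these operations.

```php
<?php
$n = null;
$n++;
var_dump($n);      // int(1)

$m = null;
@$m--;
var_dump($m);      // NULL

$flag = false;
@$flag++;
var_dump($flag);   // bool(false)

$threw = false;
try {
    $list = [];
    $list++;       // arrays cannot be incremented
} catch (TypeError $e) {
    $threw = true;
}
var_dump($threw);  // bool(true)
```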
On 18 Jan 2023 at 18:27, Kamil Tekiela tekiela246@gmail.com wrote:
Strings should not be incrementable unless they are numeric strings. The
current "feature" is more like a bug from xkcd comic. https://xkcd.com/1172/
But as there is a real need for a similar functionality, for example when
working with Excel, I would propose to add a class into the language that
is able to calculate and iterate any bijective base system. It needs to
have a clear functional spec and should support both increment/decrement
operators as well as iterators. I see this as the only way out of this
mess. This RFC needs to pass, but it cannot pass without an alternative for
people who actually use this "feature".
For those that lack imagination about possible use cases, here is mine: generating unique (in the scope of the request) alphabetic ids:
function nextid(): string {
static $id = 'zz';
return ++$id;
}
But no over-engineering please: no class and no decrement equivalent (the latter could be added in a separate RFC if it is really deemed useful), just a plain function that replicates the current behaviour for strings of the form /^[A-Za-z0-9]*$/, minus the bugs around the peculiar notion of “numeric string” (e.g., "9E1" equivalent to 90).
—Claude
Classes and methods are the expected way of implementing a standard library in
an OO language. New APIs (such as the new Random API) use OO instead of
functions and it makes more sense to use OO in this case too: there's
likely a place for other methods too, like toBase(int $otherBase) etc. It
would also be possible to use overloaded operators if needed.
On 18 Jan 2023 at 19:33, Alex Wells autaut03@gmail.com wrote:
Classes and methods are the expected way of implementing a standard library in an OO language. New APIs (such as the new Random API) use OO instead of functions and it makes more sense to use OO in this case too: there's likely a place for other methods too, like toBase(int $otherBase) etc. It would also be possible to use overloaded operators if needed.
Fortunately, PHP is not (yet) a language where every problem requires the use and manipulation of objects implementing a generalised and unified solution. I guess that the OO way of writing:
function next_alpha_id(): string {
    static $x = 'zz';
    return ++$x;
}
function next_num_id(): int {
    static $x = 0;
    return ++$x;
}
$my_id = next_alpha_id();
$my_other_id = next_num_id();
would resemble the following, except that mixed should be replaced by the use of generics. For brevity, I left the mandatory interfaces as an exercise to the reader.
class IdGenerator {
    protected mixed $current;
    public function __construct(
        protected readonly IdGeneratorType $type
      , protected readonly IdGeneratorDirection $direction
      , mixed $start
    ) {
        $this->current = $start;
    }
    public function next(): mixed {
        // implementation...
    }
}
enum IdGeneratorType {
    case alphabetic;
    case numeric;
}
enum IdGeneratorDirection {
    case positive;
    case negative;
}
final class StandardGlobalAlphabeticIdGenerator {
    private static IdGenerator $id_generator;
    public static function get(): IdGenerator {
        // ??= stores the instance, so every call shares one generator
        return self::$id_generator ??= new IdGenerator(
            type: IdGeneratorType::alphabetic
          , direction: IdGeneratorDirection::positive
          , start: 'aaa'
        );
    }
}
final class StandardGlobalNumericIdGenerator {
    private static IdGenerator $id_generator;
    public static function get(): IdGenerator {
        // ??= stores the instance, so every call shares one generator
        return self::$id_generator ??= new IdGenerator(
            type: IdGeneratorType::numeric
          , direction: IdGeneratorDirection::positive
          , start: 1
        );
    }
}
$my_id = StandardGlobalAlphabeticIdGenerator::get()->next();
$my_other_id = StandardGlobalNumericIdGenerator::get()->next();
—Claude
On Wed, Jan 18, 2023 at 10:09 PM Claude Pache claude.pache@gmail.com
wrote:
[...]
You've overcomplicated your example, but the truth is that there's
sometimes a reason behind the complexity you might see. In this case your
functions will work right until you need two independent sequences, or you
need a backwards direction, or you need a sequence that can be seeded with
a "starting point". The point is not to have the "OO way" because someone
likes it, the point is to provide more flexibility, one that is likely to
be needed down the road.
In this case, a class like Number that accepts a specific $base in the
constructor will not hurt. There are definitely use cases for custom base
numbers as mentioned above and a class is just a better way of doing it
than trying to combine multiple concepts (base-10 ints, text strings and
custom base numbers) under one or two types and a bunch of functions on
top.
When I read the RFC I was a little sceptical about the deprecation of
string increment functionality. It's something I used in the past and I see
no easy upgrade path. However, after reading this thread and thinking it
over, I realize that deprecation is the right way to go. Someone said that
it's useful when working with Excel. Excel uses bijective base-26 system.
PHP does not. I cannot even explain what logic governs PHP string increment
functionality.
The logic is actually fairly straightforward if you consider breaking
the original string into blocks of alpha, numeric and non-alphameric
characters; so a string like 'C-37AZ99' would be broken into five blocks
of characters ('C', '-', '37', 'AZ' and '99').
Start with the rightmost block, but only if it's alpha or numeric: the
process will never increment any block of characters that is non-alphameric.
Increment the current block.
If that increment would result in an overflow (extending the size of
that block) and there is another block to the "left" in the chain of
blocks, then that block is reset to its "base" value (discard the
overflow character), and the same process is repeated for incrementing
the next block in the chain.
The process terminates when there are no more blocks in the chain, or
when the process encounters a non-alphameric block.
The string is then "glued" back together again for the return.
In this regard, when a block is alpha characters, then the increment
behaviour matches "Excel's bijective base-26".
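The steps above can be sketched in PHP roughly as follows. This follows the block-wise description rather than PHP's internal implementation; the function names are hypothetical, and the separate handling of numeric strings (such as "99" becoming int 100) and of the empty string is deliberately not modelled.

```php
<?php
// Increment one all-letters or all-digits block.
// Returns [new block, true if every character wrapped around].
function increment_block(string $block): array
{
    for ($i = strlen($block) - 1; $i >= 0; $i--) {
        $c = $block[$i];
        if ($c === 'z') { $block[$i] = 'a'; continue; } // wrap, carry left
        if ($c === 'Z') { $block[$i] = 'A'; continue; }
        if ($c === '9') { $block[$i] = '0'; continue; }
        $block[$i] = chr(ord($c) + 1);                  // no carry needed
        return [$block, false];
    }
    return [$block, true];
}

function blockwise_increment(string $s): string
{
    // Split into runs of letters, runs of digits, and runs of anything else.
    preg_match_all('/[A-Za-z]+|[0-9]+|[^A-Za-z0-9]+/', $s, $m);
    $blocks = $m[0];

    for ($i = count($blocks) - 1; $i >= 0; $i--) {
        if (!preg_match('/^[A-Za-z0-9]/', $blocks[$i])) {
            break; // non-alphanumeric block: stop, any pending carry is dropped
        }
        [$blocks[$i], $overflow] = increment_block($blocks[$i]);
        if (!$overflow) {
            break;
        }
        if ($i === 0) {
            // Leftmost block overflowed: grow it by one character.
            // After wrapping, its first character is 'a', 'A' or '0'.
            $first = $blocks[0][0];
            $blocks[0] = ($first === '0' ? '1' : $first) . $blocks[0];
        }
    }
    return implode('', $blocks);
}

echo blockwise_increment('C-37AZ99'), "\n"; // C-37BA00
echo blockwise_increment('A 9'), "\n";      // A 0
```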
--
Mark Baker
On 17 Jan 2023 at 15:28, G. P. B. george.banyard@gmail.com wrote:
Hello Internals,
I would like to start the discussion on the Path to Saner
Increment/Decrement operators RFC:
https://wiki.php.net/rfc/saner-inc-dec-operators
Hi,
Adding a str_increment(...) function that roughly replicates the current behaviour of ++$x for non-numeric strings is necessary and sufficient in order to have a simple and clear path forward for those that use the feature, thanks. (On the other hand, a str_decrement(...) equivalent is not useful for the scope of this RFC.)
—Claude
I don't see a section in the RFC about JIT or anything related to OpCache,
but I know from experience with the Operator Overloads RFC that there are
several architecture specific assembly optimizations for ++ and --. Have
these been considered, and how will they be impacted?
Jordan
I don't see a section in the RFC about JIT or anything related to OpCache,
but I know from experience with the Operator Overloads RFC that there are
several architecture specific assembly optimizations for ++ and --. Have
these been considered, and how will they be impacted?
The only assembly specific code is for integer increments/decrements; as
those are not affected by this RFC, there is no impact on them.
Best regards,
George P. Banyard
I have added a section about impact on the PERL increment deprecation to
the RFC.
https://wiki.php.net/rfc/saner-inc-dec-operators
I am also planning on opening the vote on this on Wednesday the 28th of
June.
Best regards,
George P. Banyard