Hello Internals,
I’d like to present this new RFC. When discussing the issue, we first
thought that the RFC process wasn’t necessary. However, discussions on the
PR showed that selecting new letters for pack and unpack is more
challenging than we initially thought, thus creating an RFC for this change.
Here is the link to the RFC:
https://wiki.php.net/rfc/pack-unpack-endianness-signed-integers-support
Best,
Alexandre Daubois
Hi Alexandre,
Thank you for your work on this. Of all the RFCs I've seen in a while, this one is one that I'm most excited to see after writing a protobuf implementation.
If there is one thing I would be over the moon for, it would be for also adding zigzag encoding as a possible signed integer encoding (maybe using Z/z as the letter)? It is more efficient for signed integers than two's complement where a variable-length integer is desired. I can understand if this is out of scope, but I thought I'd ask.
— Rob
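For readers unfamiliar with it: zigzag encoding maps signed integers to unsigned ones so that values of small magnitude stay small (0 to 0, -1 to 1, 1 to 2, -2 to 3, and so on), which suits variable-length encodings such as protobuf varints. A minimal sketch, assuming 64-bit PHP integers; the function names zigzagEncode and zigzagDecode are hypothetical and are not part of PHP or of the RFC:

```php
<?php
// Hypothetical helpers, assuming 64-bit PHP integers.
function zigzagEncode(int $n): int
{
    // The arithmetic shift copies the sign bit into all 64 bits (0 or -1).
    return ($n << 1) ^ ($n >> 63);
}

function zigzagDecode(int $z): int
{
    // Emulate a logical shift right by masking off the propagated sign bit.
    return (($z >> 1) & PHP_INT_MAX) ^ -($z & 1);
}

var_dump(zigzagEncode(-2)); // int(3)
var_dump(zigzagDecode(3));  // int(-2)
```

The encoded value would still need a separate variable-length step before it is written out; that part is not shown here.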
Hi Rob,
Thank you for your work on this. Of all the RFCs I've seen in a while, this one is one that I'm most excited to see after writing a protobuf implementation.
Happy to see you immediately have a use case in mind while reading this!
If there is one thing I would be over the moon for, it would be for also adding zigzag encoding as a possible signed integer encoding (maybe using Z/z as the letter)?
I wasn't aware this encoding existed. I'm happy to learn about it!
Unfortunately, Z is already used with pack/unpack. After some quick
research, I think it would be a nice feature. I also noticed that
languages like Rust and Python seem to use external packages for this.
I guess it would be better to have dedicated functions. I don't
think it should be included in this RFC, but that is an interesting
idea nevertheless. I'd be really interested if someone comes up with
an RFC for this feature!
— Alexandre Daubois
Hi
On 2025-09-16 16:10, Alexandre Daubois wrote:
I guess it would be better to have dedicated functions. I don't
think it should be included in this RFC, but that is an interesting
idea nevertheless. I'd be really interested if someone comes up with
an RFC for this feature!
A better pack() with a streamlined format “description” would probably
fit with Ignace's proposed new Encoding extension:
https://externals.io/message/127716#127716
Best regards
Tim Düsterhus
Hi
On 2025-09-16 13:45, Alexandre Daubois wrote:
Here is the link to the RFC:
https://wiki.php.net/rfc/pack-unpack-endianness-signed-integers-support
Thank you for the RFC. I'm confused by the “Why Perl's Approach Cannot
Be Used in PHP” section.
- Base Letters Already Taken
The point of a modifier is to modify something. That means that there
needs to be a “base letter”. The same base letters are also “taken” in
Perl and have the same definition.
- Parser Architecture Limitations
That sounds like a simple problem to solve. When reaching one of the
“base letters” in question, look at the next character.
- Different Design Philosophy
This is simply false. v/n/V/N identically exist in Perl. J is not clear
to me, and P appears to be different (but I don't do enough Perl to say
for certain).
Best regards
Tim Düsterhus
Hi,
- Base Letters Already Taken
The point of a modifier is to modify something. That means that there
needs to be a “base letter”. The same base letters are also “taken” in
Perl and have the same definition.
- Parser Architecture Limitations
That sounds like a simple problem to solve. When reaching one of the
“base letters” in question, look at the next character.
Indeed, it is just that I'm not sure that it is worth adding modifiers
support to pack and unpack as with this addition, most (if not all)
cases should be covered then.
This is simply false. v/n/V/N identically exist in Perl. J is not clear
to me, and P appears to be different (but I don't do enough Perl to say
for certain).
Perl and PHP share common letters, you are right. But looking at the
table of each language, there are many differences. However, I may
reword it as I realized that the RFC states that differences appear
with specific endianness letters (and you showed that it's not true).
There are many differences when we're not talking about endian
specific formats actually.
— Alexandre Daubois
Hi
On 2025-09-16 16:50, Alexandre Daubois wrote:
Indeed, it is just that I'm not sure that it is worth adding modifiers
support to pack and unpack as with this addition, most (if not all)
cases should be covered then.
I don't think it is worth blocking more letters that are also
incompatible with the language where pack() was “borrowed” from.
This is simply false. v/n/V/N identically exist in Perl. J is not clear
to me, and P appears to be different (but I don't do enough Perl to say
for certain).
Perl and PHP share common letters, you are right. But looking at the
They don't just “share common letters”. The pack() function is directly
coming from Perl and that is also documented:
The idea for this function was taken from Perl and all formatting codes
work the same as in Perl. However, there are some formatting codes that
are missing such as Perl's "u" format code.
table of each language, there are many differences. However, I may
reword it as I realized that the RFC states that differences appear
with specific endianness letters (and you showed that it's not true).
There are many differences when we're not talking about endian
specific formats actually.
I'm not sure if “all formatting codes work the same” is still 100%
accurate (due to J and P), but “many differences” is definitely false.
I'm seeing the following differences:
- J and P might or might not be different.
- e, E, g, and G don't exist in Perl (it would be d<, d>, f<, and f>
respectively; these format specifiers could be deprecated if this RFC
ships).
Both w and W are already taken in Perl and would actually be a
difference.
Best regards
Tim Düsterhus
Hi,
I'm not sure if “all formatting codes work the same” is still 100%
accurate (due to J and P), but “many differences” is definitely false.
I'm seeing the following differences:
- J and P might or might not be different.
- e, E, g, and G don't exist in Perl (it would be d<, d>, f<, and f>
respectively; these format specifiers could be deprecated if this RFC
ships).
Both w and W are already taken in Perl and would actually be a
difference.
I'm not sure I understand exactly what you mean. Are you proposing to
introduce < and > modifiers and deprecate some letters, if I get it
right?
— Alexandre Daubois
Hi
On 2025-09-17 09:09, Alexandre Daubois wrote:
I'm not sure I understand exactly what you mean. Are you proposing to
introduce < and > modifiers and deprecate some letters, if I get it
right?
I'm proposing to introduce the < and > modifiers, following Perl's
lead. This would then allow to deprecate e, E, g, and G in a future
version and if / when that happens, PHP would be in sync with Perl
again. I'm not proposing that the deprecation should happen at the same
time, because folks should have at least one PHP version to migrate
where the new logic is available, but the old is not yet deprecated.
Best regards
Tim Düsterhus
Hi,
I'm proposing to introduce the < and > modifiers, following Perl's
lead. This would then allow to deprecate e, E, g, and G in a future
version and if / when that happens, PHP would be in sync with Perl
again. I'm not proposing that the deprecation should happen at the same
time, because folks should have at least one PHP version to migrate
where the new logic is available, but the old is not yet deprecated.
That would be an interesting addition. Given that only those 4 have
diverged from Perl, that could be great to sync them again. Let's wait
for other inputs to see if someone agrees with the idea before
potentially revamping the RFC. I personally like it.
— Alexandre Daubois
Hi,
I like the idea of following Perl, but also I have some concerns:
- It will make it hard or impossible to justify adding something to PHP that isn’t in Perl. The fact that they’re different is a boon and a curse. That being said, nobody has really changed this or proposed anything PHP specific in a long time.
- This is a pretty breaking change, so will it have to land in PHP 9.0?
- Which begs the question, what will be the next version of PHP? 9.0 or 8.6?
— Rob
Hi Rob,
It will make it hard or impossible to justify adding something to PHP that isn’t in Perl. The fact that they’re different is a boon and a curse. That being said, nobody has really changed this or proposed anything PHP specific in a long time.
That's right. However, if I'm correct, the proposed change in this RFC
should be the last addition as all types of data and endianness
should be supported. But I get your point indeed, it would set a
"legal precedent" on something that may not need one.
Which begs the question, what will be the next version of PHP? 9.0 or 8.6?
The next version is likely to be 8.6. The 8.5 branch has been cut this
week and the NEWS file now targets 8.6 on master when we (up)merge
changes.
— Alexandre Daubois
That would be an interesting addition. Given that only those 4 have
diverged from Perl, that could be great to sync them again. Let's wait
for other inputs to see if someone agrees with the idea before
potentially revamping the RFC. I personally like it.
PHP's unpack() is already fundamentally divergent from Perl's in that it returns an associative array, and requires its format string to be formatted differently from the one used by pack(). For instance, in Perl, one can write:
$packed = pack("NNN", 123, 456, 789);
($x, $y, $z) = unpack("NNN", $packed);
and have $x = 123, $y = 456, and $z = 789.
In PHP, however, the following code does not work as expected:
$packed = pack("NNN", 123, 456, 789);
[$x, $y, $z] = unpack("NNN", $packed);
Instead of returning a list of three integers, it returns an associative array with one key called "NN"; the other two values aren't unpacked at all. Instead, one must name each of the fields in the format string and unpack them as an associative array:
["x" => $x, "y" => $y, "z" => $z] = unpack("Nx/Ny/Nz", $packed);
I'd support changes to pack() and/or unpack() to allow them to be used symmetrically, but that seems like a larger change than what is being proposed here.
-- Andrew F
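The behaviour described above can be checked on stock PHP. A short sketch; the repeater form N3 is an alternative not mentioned in the message, and it yields 1-based integer keys rather than names:

```php
<?php
$packed = pack("NNN", 123, 456, 789);

// "NNN" is parsed as format code N followed by the field name "NN".
$single = unpack("NNN", $packed);      // ["NN" => 123]

// Naming each field, separated by slashes.
$named = unpack("Nx/Ny/Nz", $packed);  // ["x" => 123, "y" => 456, "z" => 789]

// A repeater without a name gives keys 1, 2, 3; array_values() reindexes from 0
// so that list destructuring works.
[$x, $y, $z] = array_values(unpack("N3", $packed));

var_dump($x, $y, $z); // int(123) int(456) int(789)
```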
On 16.09.2025 at 13:45, Alexandre Daubois alex.daubois+php@gmail.com wrote:
I’d like to present this new RFC. When discussing the issue, we first thought that the RFC process wasn’t necessary. However, discussions on the PR showed that selecting new letters for pack and unpack is more challenging than we initially thought, thus creating an RFC for this change.
Here is the link to the RFC: https://wiki.php.net/rfc/pack-unpack-endianness-signed-integers-support
Just a little side-note: unpackToSignedInt can be implemented more easily, e.g. like
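function unpackToSignedInt(string $bytes): int
{
    // 'V' reads an unsigned 32-bit little-endian integer; use 'N' for big endian.
    $uint32 = unpack('V', $bytes)[1];
    // Values with the top bit set are negative in two's complement.
    return $uint32 < 2 ** 31 ? $uint32 : $uint32 - 2 ** 32;
}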
While I understand the wish for having all the different sizes in all the different endiannesses, I don't think it is that big of an issue for people dealing with binary data, so I'm +-0 on whether PHP really needs it.
Regards,
- Chris
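The same idea extends to other widths and byte orders using only the letters PHP already has. A sketch, assuming 64-bit PHP 8.0 or later (for match); the helper name unpackSigned is hypothetical and is not part of the RFC:

```php
<?php
// Hypothetical helper: read a signed 16- or 32-bit integer in a given byte order
// using the existing unsigned letters n/v (16 bit) and N/V (32 bit).
function unpackSigned(string $bytes, int $bits, bool $bigEndian): int
{
    // Any other width raises UnhandledMatchError.
    $code = match ($bits) {
        16 => $bigEndian ? 'n' : 'v',
        32 => $bigEndian ? 'N' : 'V',
    };
    $unsigned = unpack($code, $bytes)[1];
    // Reinterpret as two's complement: values >= 2^(bits-1) are negative.
    return $unsigned >= 2 ** ($bits - 1) ? $unsigned - 2 ** $bits : $unsigned;
}

var_dump(unpackSigned("\xFF\xFE", 16, true));          // int(-2)
var_dump(unpackSigned("\xFE\xFF\xFF\xFF", 32, false)); // int(-2)
```

With the modifiers discussed in this thread, these two calls would correspond to the formats s> and l< respectively.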
Hi,
I’d like to present this new RFC. When discussing the issue, we first thought that the RFC process wasn’t necessary. However, discussions on the PR showed that selecting new letters for pack and unpack is more challenging than we initially thought, thus creating an RFC for this change.
Here is the link to the RFC: https://wiki.php.net/rfc/pack-unpack-endianness-signed-integers-support
After the discussions about this RFC as well as taking a step back for
a couple of weeks, I'm withdrawing the RFC.
I think the initial idea is good, but ultimately its usefulness seems
too limited to me as well, and methods such as the one Nicolas
mentions with “0 + $var” seem sufficient.
Thanks to everyone who participated in the discussion!
— Alexandre Daubois
Hi,
I think the initial idea is good, but ultimately its usefulness seems
too limited to me as well, and methods such as the one Nicolas
mentions with “0 + $var” seem sufficient.
I'm sorry, this is the wrong thread. Here is the RFC being withdrawn:
https://wiki.php.net/rfc/is-representable-as-float-int
— Alexandre Daubois
Hi folks,
I’d like to present this new RFC. When discussing the issue, we first thought that the RFC process wasn’t necessary. However, discussions on the PR showed that selecting new letters for pack and unpack is more challenging than we initially thought, thus creating an RFC for this change.
Here is the link to the RFC: https://wiki.php.net/rfc/pack-unpack-endianness-signed-integers-support
I'm getting back to this RFC. Here's the current status: it's been
proposed to implement Perl modifiers to pack/unpack. I think this is a
bit overkill in this case, as this RFC proposes to add the very last
missing formats to the two functions. Once these new formats are
added, the list is complete.
If no concerns are raised in the next few days, I'll open this RFC to
the vote in its current state.
Thanks,
— Alexandre Daubois
Hi
On 2025-10-31 10:29, Alexandre Daubois wrote:
If no concerns are raised in the next few days, I'll open this RFC to
the vote in its current state.
I will vote against the RFC in its current state for the reasons that I
outlined before.
It's needlessly diverging from Perl by inventing new letters that are
not even internally consistent with the existing letters (lowercase vs
uppercase is 16 bit vs 32 bit, not LE vs BE). It is also still making
the false statement that Perl's approach doesn't work for PHP when we
established during the discussion that it does work, if we want it to
work. It's also needlessly confusing the reader by including information
about a PR #19368 implementation that can easily be mistaken for
something that is already an established fact rather than a proposal.
Best regards
Tim Düsterhus
Hi Tim,
I will vote against the RFC in its current state for the reasons that I
outlined before.
Given the previous exchanges, I totally get it. Thank you for taking
time to discuss this topic!
It's needlessly diverging from Perl by inventing new letters that are
not even internally consistent with the existing letters (lowercase vs
uppercase is 16 bit vs 32 bit, not LE vs BE). It is also still making
the false statement that Perl's approach doesn't work for PHP when we
established during the discussion that it does work, if we want it to
work. It's also needlessly confusing the reader by including information
about a PR #19368 implementation that can easily be mistaken for
something that is already an established fact rather than a proposal.
I reworked the wording a bit and labeled the implementation as
"proposed PR" instead of "current PR" to reduce potential confusion.
Best,
— Alexandre Daubois
Hi
On 2025-10-31 14:27, Alexandre Daubois wrote:
I reworked the wording a bit and labeled the implementation as
"proposed PR" instead of "current PR" to reduce potential confusion.
This is not resolving the factual issues with the RFC. The “Why Perl's
Approach Is Not The Best Fit For PHP” section still contains the
incorrect statements that I previously pointed out in my email on
September 16th: https://news-web.php.net/php.internals/128705. With
regard to "(2) Parser Architecture Limitations" specifically, please see
my previous reply to Gina.
With regard to the “Considered Alternatives”, it is also not clear to me
what “complex migration path” there should be. Supporting specific
endianness for signed integers is a new feature. There is no migration
path.
Best regards
Tim Düsterhus
Hi
On 2025-10-31 10:29, Alexandre Daubois wrote:
If no concerns are raised in the next few days, I'll open this RFC to
the vote in its current state.
I will vote against the RFC in its current state for the reasons that I
outlined before.
It's needlessly diverging from Perl by inventing new letters that are
not even internally consistent with the existing letters (lowercase vs
uppercase is 16 bit vs 32 bit, not LE vs BE). It is also still making
the false statement that Perl's approach doesn't work for PHP when we
established during the discussion that it does work, if we want it to
work. It's also needlessly confusing the reader by including information
about a PR #19368 implementation that can easily be mistaken for
something that is already an established fact rather than a proposal.
Best regards
Tim Düsterhus
We do use lowercase/uppercase as an indicator for LE/BE for float specifiers (e/E and g/G), something Perl does not support.
However, the h (and b in Perl) specifiers use the case to indicate if it is the low or high nibble first.
Moreover, Perl and PHP primarily use cases to distinguish between signed and unsigned for integer specifiers, namely the s, l, q, i (and j for Perl) specifiers.
Only the n and v specifiers use uppercase to mean 32 bit instead of 16.
So there isn't a true "meaning" behind what upper and lowercase means, be that in Perl or in PHP.
While the < > syntax to "force" the endianness of a sequence specifier is nice, if it requires rewriting the whole parser, as this RFC implies, then you are asking someone to commit to a larger amount of work than they signed up for, which is considered bad RFC etiquette. [1]
Still, stating your reason for rejecting an RFC is a nice heads up.
The strongest argument for introducing the < and > specifiers is that Python [2] does it that way, but a key distinction is that there it is the only way to specify endianness.
And if this would be a new parser/API we might as well do other changes such as using a + or - specifier to indicate signedness, and add grouping using () like Perl.
Overall I am +-0 on this RFC because I don't use those functions, so I'll abstain from it.
Best regards,
Gina P. Banyard
[1] https://github.com/Danack/RfcCodex/blob/master/etiquette/rfc_etiquette.md#dont-volunteer-other-people-for-huge-amounts-of-work
[2] https://docs.python.org/3/library/struct.html#struct-alignment
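The case conventions discussed in this message can be observed directly with the existing letters. A small illustration that does not depend on the RFC:

```php
<?php
// n/v are 16-bit and N/V are 32-bit; the letter, not its case, picks the byte order.
echo bin2hex(pack('n', 258)), "\n"; // 0102      (16-bit big-endian)
echo bin2hex(pack('v', 258)), "\n"; // 0201      (16-bit little-endian)
echo bin2hex(pack('N', 258)), "\n"; // 00000102  (32-bit big-endian)
echo bin2hex(pack('V', 258)), "\n"; // 02010000  (32-bit little-endian)

// For s/S, case selects signedness, which only shows up when unpacking.
var_dump(unpack('s', "\xFF\xFF")[1]); // int(-1)
var_dump(unpack('S', "\xFF\xFF")[1]); // int(65535)
```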
Hi
On 2025-11-03 15:51, Gina P. Banyard wrote:
While the < > syntax to "force" the endianness of a sequence specifier
is nice.
But if this requires rewriting the whole parser as this RFC implies,
then you are asking someone to commit to a larger amount of work than
they signed up, which is considered bad RFC etiquette. [1]
I disagree with that claim in the RFC and to put my money where my mouth
is, I have spent 15 minutes writing the necessary patch for the
pack() function. It is attached to this email and also available as this
gist: https://gist.github.com/TimWolla/d8bca56a6507226e684827d2a7b44829.
Given the time spent, I've only given it light testing, but it passes
all existing pack() tests and returns the correct output for:
<?php
var_dump(bin2hex(pack('s<2s>2', 258, -2, 258, -2)));
var_dump(bin2hex(pack('a>', 258)));
I compared against perl -e "print pack('s<2s>2', 258, -2, 258, -2)" | xxd.
I have not created the patch for unpack(), but I believe
this is already sufficient demonstration that “rewriting the whole
parser” is not necessary at all.
Best regards
Tim Düsterhus
Attachments:
• 0001-pack-Support-endian-specifier.patch
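For reference, the bytes that pack('s<2s>2', 258, -2, 258, -2) should produce can be reproduced on an unpatched PHP. This relies on the assumption that pack() truncates its argument to the field width regardless of signedness, so the unsigned letters v and n yield the same output:

```php
<?php
// v = 16-bit little-endian, n = 16-bit big-endian. Negative values are
// truncated to their low 16 bits, i.e. their two's-complement representation.
var_dump(bin2hex(pack('v2n2', 258, -2, 258, -2)));
// string(16) "0201feff0102fffe"
```

The signed/unsigned distinction therefore only matters for unpack(), which is the half of the change that the proof-of-concept patch does not cover yet.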
Please don’t do this.
For those of us using pack()/unpack(), I don’t really care how much like or unlike Perl it is, and having to switch strings based on php version because someone wanted it like Perl sounds like a special kind of hell. It’s already tricky enough to get pack/unpack right when dealing with binary data and having to do it twice plus maintain two different versions of the same string… no thank you.
It’s also used subtly in all kinds of unexpected places (totp calculations, encryption polyfills, etc). This kind of change would almost necessitate a major version bump of php.
— Rob
Please don’t do this.
For those of us using pack()/unpack(), I don’t really care how much like or unlike Perl it is, and having to switch strings based on php version because someone wanted it like Perl sounds like a special kind of hell. It’s already tricky enough to get pack/unpack right when dealing with binary data and having to do it twice plus maintain two different versions of the same string… no thank you.
AFAIU the old way of doing things won't break with Tim's suggestion. So there's no need to switch strings.
It just adds the possibility of using <>.
I agree it's already tricky enough to get things right, which is exactly why Tim's approach is the right one. Instead of adding more arbitrarily chosen letters we now have a more meaningful way to indicate endianness. It also is proven by Tim's patch that this isn't hard to achieve. While implementation-wise adding some more letters is easier, Tim's patch isn't really difficult anyway.
I will vote against the RFC in its current form in favor of Tim's approach.
Kind regards
Niels
Hi,
It just adds the possibility of using <>.
I agree it's already tricky enough to get things right, which is exactly why Tim's approach is the right one. Instead of adding more arbitrarily chosen letters we now have a more meaningful way to indicate endianness. It also is proven by Tim's patch that this isn't hard to achieve. While implementation-wise adding some more letters is easier, Tim's patch isn't really difficult anyway.
So, if I get it right, you would both prefer an RFC proposing to add <
and > for letters using machine endianness, with no effect on other
letters (like Perl does)? I'm trying to think through possible edge and
error cases before settling on an opinion about this proposal. If you
have tricky cases in mind that would be worth investigating more deeply
when implementing modifiers, please let me know.
— Alexandre Daubois
Hi
On 2025-11-05 09:57, Alexandre Daubois wrote:
So, if I get it right, you would both prefer an RFC proposing to add <
and > for letters using machine endianness, with no effect on other
letters (like Perl does)?
Correct. More specifically: The modifiers should emit an error for
unsupported letters instead of silently failing. This is what my
proof-of-concept patch already implements and it's in line with unknown
letters throwing:
php > var_dump(pack('?', 123));
PHP Warning: Uncaught ValueError: Type ?: unknown format code in
php shell code:1
Other than that, I can't think of any edge cases worth handling.
Best regards
Tim Düsterhus
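In script code on PHP 8.0 or later, the exception shown above can be caught as usual. A brief illustration using the same unknown format code:

```php
<?php
try {
    pack('?', 123);
} catch (ValueError $e) {
    echo $e->getMessage(), "\n"; // Type ?: unknown format code
}
```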
Hi everyone,
I’d like to present this new RFC. When discussing the issue, we first thought that the RFC process wasn’t necessary. However, discussions on the PR showed that selecting new letters for pack and unpack is more challenging than we initially thought, thus creating an RFC for this change.
After rereading the threads and spending some time thinking about it
all, I propose a new version of this RFC aimed at adding Perl
modifiers. Indeed, this seems to be a better solution than the one
previously proposed, and several people seem to share this opinion.
The RFC URL is the same and its version has been bumped to 1.1:
https://wiki.php.net/rfc/pack-unpack-endianness-signed-integers-support
Looking forward to reading your feedback on this revision.
— Alexandre Daubois
Hi
After rereading the threads and spending some time thinking about it
all, I propose a new version of this RFC aimed at adding Perl
modifiers. Indeed, this seems to be a better solution than the one
previously proposed, and several people seem to share this opinion.
The RFC URL is the same and its version has been bumped to 1.1:
https://wiki.php.net/rfc/pack-unpack-endianness-signed-integers-support
Looking forward to reading your feedback on this revision.
Thank you.
I only have one comment on
Initially, endianness modifiers will only be supported for signed integer format codes (s, l, q) since unsigned integers already have dedicated endian-specific letters.
While there are already dedicated alternatives, I feel that restricting
the new modifiers to the lowercase versions would be unnecessarily
restrictive. Since the RFC argues that:
- Intuitive semantics: The < and > symbols visually suggest byte order direction
which I agree with, the same argument applies to the uppercase QLS
versions. As a developer I would rather remember l> as "signed long
big-endian" and L> as "unsigned long big-endian" rather than N as
"4-byte network-byte order".
Since there is no inherent limitation or ambiguity with supporting
modifiers on QLS, I would suggest just allowing it. In fact I think my
PoC patch already supported them.
There's also a formatting issue of the “Rationale” in the “Proposed
Solution” section.
Best regards
Tim Düsterhus
Hi
I agree with Tim here and have a follow up question ...
Quoting the docs from Perl, it's also supported to use <> modifiers on
floating point values but I haven't found any note about it in your RFC.
In my opinion it makes sense to allow these modifiers on fd as well for
the same reasons as QLS.
<snip>
- Real numbers (floats and doubles) are in native machine format only. Due to the multiplicity of floating-point formats and the lack of a standard "network" representation for them, no facility for interchange has been made. This means that packed floating-point data written on one machine may not be readable on another, even if both use IEEE floating-point arithmetic (because the endianness of the memory representation is not part of the IEEE spec). See also perlport <https://perldoc.perl.org/perlport>. If you know exactly what you're doing, you can use the > or < modifiers to force big- or little-endian byte-order on floating-point values.
- Also floating point numbers have endianness. Usually (but not always) this agrees with the integer endianness. Even though most platforms these days use the IEEE 754 binary format, there are differences, especially if the long doubles are involved. You can see the Config variables doublekind and longdblkind (also doublesize, longdblsize): the "kind" values are enums, unlike byteorder. Portability-wise the best option is probably to keep to the IEEE 754 64-bit doubles, and of agreed-upon endianness. Another possibility is the "%a" format of printf <https://perldoc.perl.org/functions/printf>.
- Starting with Perl 5.10.0, integer and floating-point formats,
along with the p and P formats and () groups, may all be followed
by the > or < endianness modifiers to respectively enforce big- or
little-endian byte-order. These modifiers are especially useful given
how n, N, v, and V don't cover signed integers, 64-bit integers,
or floating-point values.
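PHP already exposes fixed byte orders for floating-point values through dedicated letters (e/E for doubles, g/G for floats), which is what modifiers on d and f would mirror. A small illustration using only existing letters, assuming IEEE 754 doubles:

```php
<?php
// 1.0 as an IEEE 754 double is 0x3FF0000000000000.
echo bin2hex(pack('E', 1.0)), "\n"; // 3ff0000000000000 (big-endian)
echo bin2hex(pack('e', 1.0)), "\n"; // 000000000000f03f (little-endian)

// The machine-dependent 'd' matches one of the two, depending on the platform.
$native = pack('d', 1.0);
var_dump($native === pack('e', 1.0) || $native === pack('E', 1.0)); // bool(true)
```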
Hello Marc,
On Sun, 23 Nov 2025 at 18:04, Marc B. marc@mabe.berlin wrote:
Quoting the docs from Perl, it's also supported to use <> modifiers on floating point values but I haven't found any note about it in your RFC. In my opinion it makes sense to allow these modifiers on fd as well for the same reasons as QLS.
Thanks for this information! I think that it would make sense. I added
this to the future scope section of the RFC.
While I'm eager to go deeper into the floating-point topic with
pack/unpack, I feel that it deserves a follow-up RFC so this one
doesn't grow too much. This one focuses on integers, as its title and
URL suggest, but the core feature is now actually adding support for
modifiers. If this one is accepted, we would
have plenty of time to create a follow-up and implement it (especially
since modifiers would have already been accepted).
— Alexandre Daubois
Hi Tim,
On Sun, 23 Nov 2025 at 15:45, Tim Düsterhus tim@bastelstu.be wrote:
Initially, endianness modifiers will only be supported for signed integer format codes (s, l, q) since unsigned integers already have dedicated endian-specific letters.
While there are already dedicated alternatives, I feel that restricting
the new modifiers to the lowercase versions would be unnecessarily
restrictive. Since the RFC argues that:
- Intuitive semantics: The < and > symbols visually suggest byte order direction
which I agree with, the same argument applies to the uppercase QLS
versions. As a developer I would rather remember l> as "signed long
big-endian" and L> as "unsigned long big-endian" rather than N as
"4-byte network-byte order".
Since there is no inherent limitation or ambiguity with supporting
modifiers on QLS, I would suggest just allowing it. In fact I think my
PoC patch already supported them.
I agree. I just updated the text and tables to reflect the addition of
big and little endian unsigned integers throughout the document.
There's also a formatting issue of the “Rationale” in the “Proposed
Solution” section.
The text has been cleaned and simplified. Thanks!
— Alexandre Daubois
Hi
I agree. I just updated the text and tables to reflect the addition of
big and little endian unsigned integers throughout the document.
Thank you. In the “Complete PHP Format Letter Organization:” table you
could also add “Unsigned machine-endian" for completeness (i.e. the
uppercase QLS without modifier).
Other than that, I don't have further comments. The RFC LGTM.
Best regards
Tim Düsterhus
Hi Tim,
On Tue, 25 Nov 2025 at 23:17, Tim Düsterhus tim@bastelstu.be wrote:
Thank you. In the “Complete PHP Format Letter Organization:” table you
could also add “Unsigned machine-endian" for completeness (i.e. the
uppercase QLS without modifier).
RFC updated with the new table row. Thanks!
— Alexandre Daubois