Hi internals,
I propose adding a preserve_key_types parameter to array_keys() to address
issues caused by automatic conversion of string-numeric keys to integers.
Developer Experience Considerations
While PHP's automatic conversion of numeric string keys to integers is
documented behavior, real-world usage shows this feature continues to cause
unexpected issues:
Cognitive Burden
As observed in a Reddit discussion:
"We understand the type coercion rules, but when converting '123' to 123
still catches us off guard when debugging some problems."
As we mentioned earlier, the array_keys method
Production Risks
The implicit conversion creates hidden pitfalls:
```php
// Cache system failure example
$cache = ["123" => "data"]; // Redis expects string keys
$keys = array_keys($cache); // Returns [123] (int)
$redis->get($keys[0]); // Fails silently
```
Debugging Costs
Issues manifest only at runtime, requiring:
- Additional type validation code
- Defensive programming with array_map('strval', ...)
- Increased bug investigation time
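For reference, the conversion rule and the array_map('strval', ...) workaround from the list above can be sketched as follows. This is a minimal illustration, not part of the proposal, and the sample keys are made up for the example. Only strings in canonical decimal integer form are converted, so '0123' and '1.5' stay strings.

```php
<?php
// Illustrative keys only; none of these come from a real system.
$data = [
    '123'    => 'a', // canonical decimal integer: stored as int 123
    '-5'     => 'b', // also canonical: stored as int -5
    '0123'   => 'c', // leading zero: stays a string
    '1.5'    => 'd', // not an integer: stays a string
    'user:1' => 'e', // non-numeric: stays a string
];

$types = array_map('gettype', array_keys($data));
// $types is ['integer', 'integer', 'string', 'string', 'string']

// Defensive normalization: every key becomes a string again.
$stringKeys = array_map('strval', array_keys($data));
// $stringKeys is ['123', '-5', '0123', '1.5', 'user:1']
```

Because only canonical integer strings are converted, casting back with strval() reproduces the original text of such a key. What it cannot tell you is whether the key was written as '123' or as 123.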
Problem Examples
Database Performance Issues
```php
$orderIds = ["1001" => "data", "1002" => "data"]; // VARCHAR keys in database
$keys = array_keys($orderIds); // [1001, 1002] (unexpected int)
$db->query("SELECT * FROM orders WHERE id IN (" . implode(',', $keys) . ")");
```
→ May cause full table scans when VARCHAR indexes are ignored.
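One way to sidestep the type mismatch in userland, sketched here under the assumption that the ids are array keys and the column is VARCHAR, is to cast the keys back to strings and bind them as parameters instead of concatenating them into the SQL. The helper name buildOrderQuery() is hypothetical; the orders table is the one from the example above.

```php
<?php
// Hypothetical helper: buildOrderQuery() is not from the original message.
function buildOrderQuery(array $rows): array
{
    if ($rows === []) {
        throw new InvalidArgumentException('At least one id is required.');
    }
    // Cast keys back to strings so they match a VARCHAR column.
    $ids = array_map('strval', array_keys($rows));
    // One "?" placeholder per id, to be bound by the database driver.
    $placeholders = implode(',', array_fill(0, count($ids), '?'));

    return ["SELECT * FROM orders WHERE id IN ($placeholders)", $ids];
}

[$sql, $params] = buildOrderQuery(['1001' => 'data', '1002' => 'data']);
// $sql    is "SELECT * FROM orders WHERE id IN (?,?)"
// $params is ['1001', '1002']
```

With PDO, for instance, the pair could be handed to prepare() and execute(), which binds array values as strings.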
Redis Cache Failures
```php
$cacheData = ["user:1001" => "data", "user:1002" => "data"];
$keys = array_keys($cacheData); // ["user:1001", "user:1002"] (correct)
$numericData = ["1001" => "data", "1002" => "data"];
$numericKeys = array_keys($numericData); // [1001, 1002] (converted to int)
$redis->mget(array_merge($keys, $numericKeys)); // Partial failure
```
→ Mixed key types cause silent cache misses.
Proposal
Add a 4th parameter:
```php
array_keys(
    array $array,
    mixed $search_value = null,
    bool $strict = false,
    bool $preserve_key_types = false
): array
```
When true, maintains original key types.
Questions for Discussion
Design Considerations
- Should we provide a way to opt-out of this automatic conversion?
- Would a new parameter be preferable to a separate function (e.g.
array_keys_preserve())?
Use Case Validation
- Are the database and Redis examples sufficient to justify this change?
- Are there other common use cases we should consider?
Implementation Considerations
- Should we consider making this the default behavior in a future major
version?
- Are there performance implications we should evaluate?
Looking forward to your feedback.
Best regards,
[xiaoma]
As far as I know, this is how array keys are designed and changing
array_keys wouldn't solve the problem. The key is converted to the
appropriate type during creation. Array keys are defined as string|int,
and expecting them to be exclusively string or exclusively int is a
programming error. Static analysis tools should catch that.
If a developer remembers to use preserve_key_types then they would
also remember that the key can be string|int and would design the rest
of the code accordingly. So it seems to me like this solution is not
the right one.
I sympathise with the problem and I would prefer that the array keys
keep the type with which they were created. I have seen so many times
when a developer did something like this:
$arr = ['123' => 'foo'];
foreach ($arr as $key => $index) {
    echo htmlentities($key);
}
Do I remember correctly that it was because of backwards compatibility
issues that this could not be changed?
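The pitfall in the loop above is presumably that $key arrives as an int, which htmlentities() rejects with a TypeError when the file declares strict_types=1. A minimal sketch of the defensive cast follows; escapeKeys() is a made-up name used only for illustration.

```php
<?php
// Hypothetical helper name; it only illustrates the cast.
function escapeKeys(array $arr): array
{
    $escaped = [];
    foreach ($arr as $key => $value) {
        // $key is int for '123' and string for '<b>', so cast before
        // handing it to a function that expects a string.
        $escaped[] = htmlentities((string) $key);
    }
    return $escaped;
}

$arr = ['123' => 'foo', '<b>' => 'bar'];
$firstKeyIsInt = is_int(array_key_first($arr)); // true
$result = escapeKeys($arr);                     // ['123', '&lt;b&gt;']
```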
As Kamil mentioned, this is not limited to a single function but to
array type as a whole. See previous discussion
https://externals.io/message/116735.
I propose adding a preserve_key_types parameter to array_keys() to
address issues caused by automatic conversion of string-numeric keys
to integers.
I've carefully read the discussion at [https://externals.io/message/116735].
While I understand the historical reasons make it difficult to directly
change array behavior, this automatic conversion issue does confuse many
PHP developers and needs to be addressed. I'd like to propose two solutions:
1. First, we could add a parameter like preserve_key_types to array
functions such as array_keys()/array_search() to temporarily handle the
implicit conversion issue.
2. Alternatively, we could introduce a new data type "Map" (inspired by
Java and other languages) to completely solve this conversion problem:
```php
// Declaration (preserves original key types)
// Explicit declaration, keys "01" and "10" won't convert to int
$map = new Map(['01' => 'a', '10' => 'b']);
// Syntactic sugar for the above line, similar to array's [] syntax
$map = {"01"=>111, "02"=>222};
```
This Map type would support conversion to/from traditional arrays:
```php
$array = [1, 2, 3];
$map = Map::fromArray($array); // Explicit conversion
$newArray = $map->toArray(); // Convert back to traditional array
```
During iteration, keys would maintain their original types:
```php
foreach ($map as $key => $value) {
    // $key preserves its original type (string/int)
}
```
Existing functions like array_keys()/array_search() could accept Map
parameters while maintaining all other logic identical to traditional
arrays, except keys wouldn't be implicitly converted.
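As a rough userland approximation of the idea, and not the proposed native type, the sketch below keeps each original key next to its value and yields it from a generator, since generators may yield keys of any type. The class name TypedKeyMap and its methods are hypothetical. Note that it is filled through set() calls: an array literal passed to a constructor would already have had its keys converted.

```php
<?php
// Hypothetical userland sketch; TypedKeyMap and its methods are illustrative names.
final class TypedKeyMap implements IteratorAggregate
{
    /** @var array<string, array{0: int|string, 1: mixed}> */
    private array $entries = [];

    private static function slot(int|string $key): string
    {
        // The prefix keeps int 10 and string "10" in different slots, and a
        // prefixed string is never numeric, so PHP leaves it as a string.
        return (is_int($key) ? 'i:' : 's:') . $key;
    }

    public function set(int|string $key, mixed $value): void
    {
        $this->entries[self::slot($key)] = [$key, $value];
    }

    public function get(int|string $key): mixed
    {
        return $this->entries[self::slot($key)][1] ?? null;
    }

    /** Returns the keys with the types they were set with. */
    public function keys(): array
    {
        return array_values(array_map(fn (array $e) => $e[0], $this->entries));
    }

    public function getIterator(): Generator
    {
        foreach ($this->entries as [$key, $value]) {
            yield $key => $value; // generators may yield keys of any type
        }
    }
}

$map = new TypedKeyMap();
$map->set('10', 'b'); // string key
$map->set(10, 'c');   // int key, stored separately from '10'
$map->set('01', 'a');

$seen = [];
foreach ($map as $key => $value) {
    $seen[] = [$key, $value];
}
// $seen is [['10', 'b'], [10, 'c'], ['01', 'a']]
```

Converting such an object back with iterator_to_array() would put the keys into a regular array again, so the usual conversion would apply at that point.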
[xiaoma]
On Tue, May 27, 2025 at 15:42, Daikaras webmaster@daikaras.lt wrote:
As Kamil mentioned, this is not limited to a single function but to array
type as a whole. See previous discussion
https://externals.io/message/116735.
I propose adding a preserve_key_types parameter to array_keys() to
address issues caused by automatic conversion of string-numeric keys to
integers.
First, we could add a parameter like preserve_key_types to array
functions such as array_keys()/array_search() to temporarily handle the
implicit conversion issue.
This will not help. The keys are changed when they are written to the array, not when they are read back out. No option to array_keys can tell you whether the key 42 was originally set as '42', because that information is not stored anywhere.
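A short sketch of that point (variable names are illustrative): the key is normalized at the moment it is written, so arrays built with '42' and with 42 are indistinguishable afterwards.

```php
<?php
$fromString = [];
$fromString['42'] = 'value'; // written with a string key

$fromInt = [];
$fromInt[42] = 'value';      // written with an int key

// Both arrays now hold the single int key 42.
$sameKeys = array_keys($fromString) === array_keys($fromInt); // true

// '42' and 42 address the same slot, so the second write overwrites the first.
$slots = ['42' => 'first'];
$slots[42] = 'second';
$slotCount = count($slots); // 1
```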
Alternatively, we could introduce a new data type "Map"
Yes, I think this was suggested a couple of times on the previous thread. It would be a useful feature, but probably not easy to implement efficiently and integrate thoroughly into the language.
Regards,
Rowan Tommins
[IMSoP]
Hi Rowan,
Thank you for the clear technical guidance. I agree we've reached a
consensus that implementing a native Map type would be the proper solution
to this long-standing issue.
To move forward, I'd like to clarify a few practical aspects:
- RFC Proposal
  - Should I initiate an RFC draft at this stage?
  - If yes, would you recommend starting with an "idea" thread on
    internals@lists.php.net first?
- Implementation Resources
  - Given my limited experience with PHP core development:
    - What would be the minimal viable prototype to demonstrate
      feasibility?
    - Are there active contributors you could refer who might be
      interested in collaborating?
- Interim Steps
  - Would it help to:
    - Compile real-world use cases from major frameworks?
    - Benchmark existing userland implementations (e.g., DS\Map)?
I'm committed to driving this improvement and willing to coordinate
non-code efforts. Your advice on the most effective next steps would be
invaluable.
Best regards,
[xiaoma]
On Thu, May 29, 2025 at 00:43, Rowan Tommins [IMSoP] imsop.php@rwec.co.uk wrote:
First, we could add a parameter like preserve_key_types to array
functions such as array_keys()/array_search() to temporarily handle the
implicit conversion issue.
This will not help. The keys are changed when they are written to the
array, not when they are read back out. No option to array_keys can tell
you whether the key 42 was originally set as '42', because that information
is not stored anywhere.
Alternatively, we could introduce a new data type "Map"
Yes, I think this was suggested a couple of times on the previous thread.
It would be a useful feature, but probably not easy to implement
efficiently and integrate thoroughly into the language.
Regards,
Rowan Tommins
[IMSoP]
There's been a fair amount of discussion in the past, and I am 100% in favor of splitting list/sequence, set, and map types into separate types. However, there are some challenges to doing so, mainly around, of course, type variance and generics.
See:
https://thephp.foundation/blog/2024/08/19/state-of-generics-and-collections/#full-reified-generics
(The "Collections" section, specifically, and the link to my research document.)
Gina is working hard on an associated types RFC ("generics junior"), which would obviate the need for any custom syntax there. That would actually allow doing most of the collection implementation in user-space; the missing parts would be operators and being able to use more efficient C implementations. If Gina's RFC passes (it's not yet proposed, unfortunately), I intend to try and write user-space versions to flesh out the design if nothing else. Ideally I'd love for that to turn into an RFC in the future, but the bar for that is of course higher, and we'll have to see what else happens in the meantime.
I don't mean to dissuade you from looking into the topic; I just want you to be aware of the prior art so that we don't end up with multiple conflicting half-baked approaches instead of a single fully baked approach.
--Larry Garfield