Hi internals,
Voting has started on the ImmutableIterable RFC https://wiki.php.net/rfc/cachediterable
Previous discussion can be found in https://externals.io/message/114834
Recent changes:
- The name was renamed to
ImmutableIterable
to indicate that it cannot be changed after being constructed.
It was brought up in previous discussions that the previous name ofCachedIterable
could easily be assumed to have functionality similar to on-demand iterators/iterables such as https://php.net/cachingiterator
(Additionally, immutability is rare among spl data structures) -
__set_state
was added
Thanks,
Tyson
Hey Tyson,
http://ocramius.github.com/
On Tue, Jun 15, 2021 at 4:01 PM tyson andre tysonandre775@hotmail.com
wrote:
Hi internals,
Voting has started on the ImmutableIterable RFC
https://wiki.php.net/rfc/cachediterablePrevious discussion can be found in https://externals.io/message/114834
Recent changes:
- The name was renamed to
ImmutableIterable
to indicate that it cannot
be changed after being constructed.
It was brought up in previous discussions that the previous name of
CachedIterable
could easily be assumed to have functionality similar to
on-demand iterators/iterables such as https://php.net/cachingiterator
(Additionally, immutability is rare among spl data structures)__set_state
was added
I see the RFC as valuable but:
-
__serialize
and__unserialize
are out of scope: this depends on the
contents of it, and there's no point in implementing them -
__set_state
should also not be implemented:var_export()
like any
other object and it should be fine -
jsonSerialize
also depends on the contents, and shouldn't be exposed
All of this is not part of what should be in a reusable iterator.
Greets,
Marco Pivetta
Hi Marco Pivetta,
I see the RFC as valuable but:
*
__serialize
and__unserialize
are out of scope: this depends on the contents of it, and there's no point in implementing them
*__set_state
should also not be implemented:var_export()
like any other object and it should be fine
*jsonSerialize
also depends on the contents, and shouldn't be exposedAll of this is not part of what should be in a reusable iterator.
This is an IteratorAggregate implementation and all of the contents of the inner iterable are eagerly evaluated in the constructor.
I'd also considered names such as ImmutableIteratorAggregate, ImmutableKeyValueSequence or ImmutableEntrySet, but was unhappy with all of the names (excessively long, misleading, ambiguous, etc),
and prior discussion on the mailing list lead me to believe short names were widely preferred over long names https://externals.io/message/114834#114812
If it was lazily evaluated such as proposed in https://wiki.php.net/rfc/cachediterable#future_scope , I'd agree that __serialize
, __unserialize
, json_encode
, etc likely didn't belong in a lazily evaluated data structure,
but the goal was creating a reusable data structure
(e.g. that could be used to store key-value sequences from any source compactly (e.g. generators) and be serialized
and persisted to memcached, redis, a file, static array, etc)
Cheers,
Tyson
On Tue, Jun 15, 2021 at 4:01 PM tyson andre tysonandre775@hotmail.com
wrote:
Hi internals,
Voting has started on the ImmutableIterable RFC
https://wiki.php.net/rfc/cachediterablePrevious discussion can be found in https://externals.io/message/114834
Recent changes:
- The name was renamed to
ImmutableIterable
to indicate that it cannot
be changed after being constructed.
It was brought up in previous discussions that the previous name of
CachedIterable
could easily be assumed to have functionality similar to
on-demand iterators/iterables such as https://php.net/cachingiterator
(Additionally, immutability is rare among spl data structures)__set_state
was added
Hey Tyson,
I like the concept here. I think the naming choice is unfortunate, and
causing confusion for people.
What you're really proposing here is a data structure: A sequence of
key-value-pairs. That generally seems like a sensible thing to have, in
that we can implement it significantly more efficiently in core than you
could do it in userland, especially when it comes to memory usage.
The issue is that you're not really framing this as a data structure, but
as an iterable. I get that memoizing an iterable was the original
motivation here, but I think it causes confusion. If this were
KeyValueSequence::fromIterable($iterable), I think that the meaning and
behavior would be perfectly clear -- of course it would eagerly collect the
iterable, there is no other way it could reasonably work! I think Marco's
concerns wouldn't come up either -- it's perfectly reasonable for a data
structure to implement support for serialization and JSON encoding. Not so
much for an iterator.
Regards,
Nikita
Hi Nikita,
I like the concept here. I think the naming choice is unfortunate, and causing confusion for people.
What you're really proposing here is a data structure: A sequence of key-value-pairs. That generally seems like a sensible thing to have, in that we can implement it significantly more efficiently in core than you could do it in userland, especially when it comes to memory usage.
The issue is that you're not really framing this as a data structure, but as an iterable. I get that memoizing an iterable was the original motivation here, but I think it causes confusion. If this were KeyValueSequence::fromIterable($iterable), I think that the meaning and behavior would be perfectly clear -- of course it would eagerly collect the iterable, there is no other way it could reasonably work! I think Marco's concerns wouldn't come up either -- it's perfectly reasonable for a data structure to implement support for serialization and JSON encoding. Not so much for an iterator.
It may be more positive,
but I don't think it'd pass with any name or by changing the constructor to fromIterable.
Over half of the objections are to functionality, over half to unspecified reasons,
and other email discussion responses don't seem to indicate interest in having the functionality, just on clarifying implementation or naming details
I was trying to avoid proposing functionality similar to that already in php-ds or an improvement to spl
(especially with ongoing namespacing policy discussion),
but that seems to be a mistake - if it was chosen for inclusion in those modules then it'd be a very common use case
(e.g. https://www.php.net/manual/en/splqueue.construct, https://www.php.net/manual/en/class.ds-map.php, etc)
Cheers,
Tyson
Over half of the objections are to functionality, over half to unspecified reasons,
I support people choosing not to directly respond with their
unspecified reasons, but if anyone is open to sharing I would
appreciate them doing so. I agree with Nikita that the concept is
useful, and with some refinement it could be more acceptable to
voters. If we know more about these "unspecified reasons" then perhaps
we can improve something for them as well.
A guess: they don't really want unproven data structures going into
core, well, because you see the state of some (many) of the SPL data
structures.
Over half of the objections are to functionality, over half to unspecified reasons,
I support people choosing not to directly respond with their
unspecified reasons, but if anyone is open to sharing I would
appreciate them doing so. I agree with Nikita that the concept is
useful, and with some refinement it could be more acceptable to
voters. If we know more about these "unspecified reasons" then perhaps
we can improve something for them as well.A guess: they don't really want unproven data structures going into
core, well, because you see the state of some (many) of the SPL data
structures.
While I like the idea of an immutable collection, and the performance boost seems useful, this proposal seems to go about it in a sloppy way.
-
Iterable doesn't seem like the right "family" for this. It is iterable, but so are lots of other things.
-
I... have never seen anyone in PHP use "pairs" as a concept. I have no idea what they're doing here.
-
The JsonSerialize seems out of place. It may make sense from another angle, but it just sorta appears out of nowhere here.
It almost feels like what you actually want is an immutable Dictionary class. Such would naturally be iterable, countable, serializing makes some sense, a fromIterable() method would make sense, etc. That I could get behind, potentially, although it also runs into the exciting question of type restrictions and thus generics, which is where list type discussions go to die. :-)
So it's not the concept I'm against; the angle of attack here feels sloppy and coming at the problem from the wrong angle.
--Larry Garfield
Hi Larry Garfield,
Thanks for responding.
While I like the idea of an immutable collection, and the performance boost seems useful, this proposal seems to go about it in a sloppy way.
- Iterable doesn't seem like the right "family" for this. It is iterable, but so are lots of other things.
I'd suggested alternative names such as ImmutableKeyValueSequence, in https://externals.io/message/114834#114834 , but
- It seemed as if the sentiment was very strongly against long names. I likely misjudged this.
- Many names were suggested by only one person. I can't tell if there's a consensus from that. E.g.
*Aggregate
. - When I suggested names such as
ImmutableKeyValueSequence
before starting the vote, nobody had any feedback of it being better/worse than my previous proposals.
- I... have never seen anyone in PHP use "pairs" as a concept. I have no idea what they're doing here.
https://www.php.net/manual/en/class.ds-pair.php is a concept used in the DS PECL, e.g. https://www.php.net/manual/en/ds-map.first.php
Proposing that as a new object type seemed excessive here.
That reason is because PHP is (fairly) unique among languages with generic iterable types in that there's a key associated with values.
I had to deal with this unusual situation somehow, and it's not a surprise that the solution is also unusual.
Do you propose alternate solutions other than omitting the functionality?
- Javascript only provides values in .next() - https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Iteration_protocols
- Python only provides values https://docs.python.org/3/glossary.html#term-iterable
- C++ iterators only provide values https://www.cplusplus.com/reference/iterator/
- And so on
A generator-based workaround would be much slower
function userland create_iterable(iterable $pairs) {
foreach ($pairs as [$key, $value]) {
yield $key => $value;
}
}
// $pairs = array_map(...array_filter(...fetchOrComputeData(...)...)
$iterator = new ImmutableKeyValueSequence($pairs);
The other reason is that php has a large collection of internal and user-defined functions for dealing with arrays (sorting, filtering, etc), but few for iterables.
toPairs and fromPairs allow easily converting values to this and back, then calling usort/filter for compact code.
And if I provided fromPairs, toPairs seemed to make sense for completeness.
- The JsonSerialize seems out of place. It may make sense from another angle, but it just sorta appears out of nowhere here.
It almost feels like what you actually want is an immutable Dictionary class. Such would naturally be iterable, countable, serializing makes some sense, a fromIterable() method would make sense, etc.
It would be useful for some but not all use cases. Especially use cases where keys aren't hashable, or where keys are repeated.
Not all values would be hashable in a dictionary (e.g. circular data structures, self-referential arrays).
There's a lot of open design questions for Dictionary in core, e.g. the name, and whether objects should be hashable, or namespace, or whether it may conflict with future native types.
- And if a Hashable magic method or interface was added, then that might throw and make it impossible to store a generator.
- And if large data structures are used (e.g. yielding extremely large keys or slow object hashing, the hashing would be slow even when the application didn't need hashing at all)
That I could get behind, potentially, although it also runs into the exciting question of type restrictions and thus generics, which is where list type discussions go to die. :-)
That's another possible obstacle to dictionary in core, but I hope not.
Thanks,
Tyson