Declaration-aware attributes

2 years ago by Andreas Hennings — view source

unread

Hello internals,
I am picking up an idea that was mentioned by Benjamin Eberlei in the past.
https://externals.io/message/110217#110395
(we probably had the idea independently, but Benjamin's is the first
post where I see it mentioned in the list)

Quite often I found myself writing attribute classes that need to fill
some default values or do some validation based on the symbol the
attribute is attached to.
E.g. a parameter attribute might require a specific type on that
parameter, or it might fill a default value based on the parameter
name.

Currently I see two ways to do this:

Do the logic in the code that reads the attribute, instead of the
attribute class. This works ok for one-off attribute classes, but it
becomes quite unflexible with attribute interfaces, where 3rd parties
can provide their own attribute class implementations.
Add additional methods to the attribute class that take the symbol
reflector as a parameter, like "setReflectionMethod()", or
"setReflectionClass()". Or the method in the attribute class that
returns the values can have a reflector as a parameter.

Both of these are somewhat limited and unpleasant.

I want to propose a new way to do this.
Get some feedback first, then maybe an RFC.

The idea is to mark constructor parameters of the attribute class with
a special parameter attribute, to receive the reflector.
The other arguments are then shifted to skip the "special" parameter.

#[Attribute]
class A {
public function __construct(
public readonly string $x,
#[AttributeContextClass]
public readonly \ReflectionClass $class,
public readonly string $y,
) {}
}

$a = (new ReflectionClass(C::class))->getAttributes()[0]->newInstance();
assert($a instanceof A);
assert($a->x === 'x');
assert($a->class->getName() === 'C');
assert($a->y === 'y');

Note that for methods, we typically need to know the method reflector
and the class reflector, because the method could be defined in a
base class.

#[Attribute]
class AA {
public function __construct(
#[AttributeContextClass]
public readonly \ReflectionClass $class,
#[AttributeContextMethod]
public readonly ReflectionMethod $method,
) {}
}

class B {
#[AA]
public function f(): void {}
}

class CC extends B {}

$aa = (new ReflectionMethod(CC::class, 'f))->getAttributes()[0]->newInstance();
assert($a->class->getName() === 'CC');
assert($a->method->getName() === 'f');

Notice that the original proposal by Benjamin would use an interface
and a setter method, ReflectorAwareAttribute::setReflector().

I prefer to use constructor parameters, because I generally prefer if
a constructor creates a complete and immutable object.

Thoughts?

-- Andreas

2 years ago by Andreas Hennings — view source

unread

A big TBD would be the new attribute classes we need to mark these parameters.
Perhaps we could use just one attribute class, e.g.
"AttributeDeclaration", and use the parameter type to determine which
part of the declaration is expected.

Also, if an attribute is allowed on different symbol types, then these
parameters could be made nullable, or allow multiple types.
Of course some combinations would be ambiguous, e.g.
\ReflectionClass|\ReflectionMethod.

#[Attribute(Attribute::TARGET_METHOD, Attribute::TARGET_PROPERTY,
Attribute::TARGET_CLASS_CONSTANT)]
class A {
public function __construct(
public readonly string $x,
#[AttributeDeclaration] public readonly \ReflectionClass $class,
#[AttributeDeclaration] public readonly ?\ReflectionMethod $method,
#[AttributeDeclaration] public readonly
?\ReflectionClassConstant|\ReflectionProperty $constantOrProperty,
public readonly string $y,
) {}
}

Hello internals,
I am picking up an idea that was mentioned by Benjamin Eberlei in the past.
https://externals.io/message/110217#110395
(we probably had the idea independently, but Benjamin's is the first
post where I see it mentioned in the list)

Quite often I found myself writing attribute classes that need to fill
some default values or do some validation based on the symbol the
attribute is attached to.
E.g. a parameter attribute might require a specific type on that
parameter, or it might fill a default value based on the parameter
name.

Currently I see two ways to do this:

Do the logic in the code that reads the attribute, instead of the
attribute class. This works ok for one-off attribute classes, but it
becomes quite unflexible with attribute interfaces, where 3rd parties
can provide their own attribute class implementations.

Add additional methods to the attribute class that take the symbol
reflector as a parameter, like "setReflectionMethod()", or
"setReflectionClass()". Or the method in the attribute class that
returns the values can have a reflector as a parameter.

Both of these are somewhat limited and unpleasant.

I want to propose a new way to do this.
Get some feedback first, then maybe an RFC.

The idea is to mark constructor parameters of the attribute class with
a special parameter attribute, to receive the reflector.
The other arguments are then shifted to skip the "special" parameter.

#[Attribute]
class A {
public function __construct(
public readonly string $x,
#[AttributeContextClass]
public readonly \ReflectionClass $class,
public readonly string $y,
) {}
}

$a = (new ReflectionClass(C::class))->getAttributes()[0]->newInstance();
assert($a instanceof A);
assert($a->x === 'x');
assert($a->class->getName() === 'C');
assert($a->y === 'y');

Note that for methods, we typically need to know the method reflector
and the class reflector, because the method could be defined in a
base class.

#[Attribute]
class AA {
public function __construct(
#[AttributeContextClass]
public readonly \ReflectionClass $class,
#[AttributeContextMethod]
public readonly ReflectionMethod $method,
) {}
}

class B {
#[AA]
public function f(): void {}
}

class CC extends B {}

$aa = (new ReflectionMethod(CC::class, 'f))->getAttributes()[0]->newInstance();
assert($a->class->getName() === 'CC');
assert($a->method->getName() === 'f');

Notice that the original proposal by Benjamin would use an interface
and a setter method, ReflectorAwareAttribute::setReflector().

I prefer to use constructor parameters, because I generally prefer if
a constructor creates a complete and immutable object.

Thoughts?

-- Andreas

2 years ago by Dusk — view source

unread

Quite often I found myself writing attribute classes that need to fill
some default values or do some validation based on the symbol the
attribute is attached to.
E.g. a parameter attribute might require a specific type on that
parameter, or it might fill a default value based on the parameter
name.

+1. This is a substantial limitation in the attribute system.

Currently I see two ways to do this:

Do the logic in the code that reads the attribute, instead of the
attribute class. This works ok for one-off attribute classes, but it
becomes quite unflexible with attribute interfaces, where 3rd parties
can provide their own attribute class implementations.

Add additional methods to the attribute class that take the symbol
reflector as a parameter, like "setReflectionMethod()", or
"setReflectionClass()". Or the method in the attribute class that
returns the values can have a reflector as a parameter.

I see a third way which introduces less "magic":

3.a. Add a method to ReflectionAttribute which retrieves the target of the attribute as an appropriate reflection object. (Sadly, the obvious name "getTarget" is already taken; I'll call it "getReflectionTarget" for now.)

3.b. Add a static method to ReflectionAttribute which, when called within an Attribute constructor which is being called by ReflectionAttribute::newInstance(), returns the ReflectionAttribute object which is being instantiated.

These features could be used together to set a default property on an attribute based on its target, e.g.

#[Attribute(Attribute::TARGET_PROPERTY)]
class PropertyAnnotation {
  public string $name;

  public function __construct(?string $name = null) {
    $this->name = $name ?? ReflectionAttribute::underConstruction()->getReflectionTarget()->getName();
  }
}

Another variant that comes to mind is:

3.b. Add a new flag to attributes which causes the ReflectionAttribute object to be passed to the constructor as the first argument, e.g.

#[Attribute(Attribute::TARGET_PROPERTY | Attribute::WHO_MADE_ME)]
class PropertyAnnotation {
  public string $name;

  public function __construct(ReflectionAttribute $attr, ?string $name = null) {
    $this->name = $name ?? $attr->getReflectionTarget()->getName();
  }
}

This is a little messier because it can't be used "under the covers" by an attribute base class, but it accomplishes the same goals.

2 years ago by Andreas Hennings — view source

unread

Thanks for the feedback!

Quite often I found myself writing attribute classes that need to fill
some default values or do some validation based on the symbol the
attribute is attached to.
E.g. a parameter attribute might require a specific type on that
parameter, or it might fill a default value based on the parameter
name.

+1. This is a substantial limitation in the attribute system.

Currently I see two ways to do this:

Do the logic in the code that reads the attribute, instead of the
attribute class. This works ok for one-off attribute classes, but it
becomes quite unflexible with attribute interfaces, where 3rd parties
can provide their own attribute class implementations.

Add additional methods to the attribute class that take the symbol
reflector as a parameter, like "setReflectionMethod()", or
"setReflectionClass()". Or the method in the attribute class that
returns the values can have a reflector as a parameter.

I see a third way which introduces less "magic":

Actually, these two options are what is possible with the current
version of PHP, but it requires to write php code to solve these
problems each time.
The actual proposal in terms of "language design" solution was further
below in my email :)

3.a. Add a method to ReflectionAttribute which retrieves the target of the attribute as an appropriate reflection object. (Sadly, the obvious name "getTarget" is already taken; I'll call it "getReflectionTarget" for now.)

3.b. Add a static method to ReflectionAttribute which, when called within an Attribute constructor which is being called by ReflectionAttribute::newInstance(), returns the ReflectionAttribute object which is being instantiated.

These features could be used together to set a default property on an attribute based on its target, e.g.
#[Attribute(Attribute::TARGET_PROPERTY)]
class PropertyAnnotation {
  public string $name;

  public function __construct(?string $name = null) {
    $this->name = $name ?? ReflectionAttribute::underConstruction()->getReflectionTarget()->getName();
  }
}
Another variant that comes to mind is:

3.b. Add a new flag to attributes which causes the ReflectionAttribute object to be passed to the constructor as the first argument, e.g.
#[Attribute(Attribute::TARGET_PROPERTY | Attribute::WHO_MADE_ME)]
class PropertyAnnotation {
  public string $name;

  public function __construct(ReflectionAttribute $attr, ?string $name = null) {
    $this->name = $name ?? $attr->getReflectionTarget()->getName();
  }
}
This is a little messier because it can't be used "under the covers" by an attribute base class, but it accomplishes the same goals.

This is close to the parameter attribute I proposed.
The benefit of the parameter attribute is that a developer can clearly
see which parameter will be skipped for regular arguments.

2 years ago by Andreas Hennings — view source

unread

I just notice a flaw in my thinking.

Note that for methods, we typically need to know the method reflector
and the class reflector, because the method could be defined in a
base class.

This is true when doing a discovery using attributes.
However, PHP does not know the original class when we call (new
ReflectionClass('C'))->getMethod('f')->getAttributes().
So it cannot pass the original class into the attribute constructor.

Currently, only the userland code that does the discovery knows the
original class.

We would need a type of method reflector that keeps track of the original class.
This could well be its own separate RFC.
Once this is done, we no longer need to pass the class as a separate argument.

So to keep this RFC independent, we should only pass the reflector
object where ->getAttributes()[*]->newInstance() was called on.
Then if in a separate RFC we keep track of the original class in
ReflectionMethod, the same information will be available when the
method reflector is injected into an attributes.

This also means that the ReflectionAttribute object needs to keep a
reference of the original reflector.
And if it does that, we can as well provide a method to return it.

assert((new ReflectionClass('C'))->getAttributes()[0]->getReflector()->getName()
=== 'C');

2 years ago by Stephen Reay — view source

unread

Hello internals,
I am picking up an idea that was mentioned by Benjamin Eberlei in the past.
https://externals.io/message/110217#110395
(we probably had the idea independently, but Benjamin's is the first
post where I see it mentioned in the list)

Quite often I found myself writing attribute classes that need to fill
some default values or do some validation based on the symbol the
attribute is attached to.
E.g. a parameter attribute might require a specific type on that
parameter, or it might fill a default value based on the parameter
name.

Currently I see two ways to do this:

Do the logic in the code that reads the attribute, instead of the
attribute class. This works ok for one-off attribute classes, but it
becomes quite unflexible with attribute interfaces, where 3rd parties
can provide their own attribute class implementations.

Add additional methods to the attribute class that take the symbol
reflector as a parameter, like "setReflectionMethod()", or
"setReflectionClass()". Or the method in the attribute class that
returns the values can have a reflector as a parameter.

Both of these are somewhat limited and unpleasant.

I want to propose a new way to do this.
Get some feedback first, then maybe an RFC.

The idea is to mark constructor parameters of the attribute class with
a special parameter attribute, to receive the reflector.
The other arguments are then shifted to skip the "special" parameter.

#[Attribute]
class A {
public function __construct(
public readonly string $x,
#[AttributeContextClass]
public readonly \ReflectionClass $class,
public readonly string $y,
) {}
}

$a = (new ReflectionClass(C::class))->getAttributes()[0]->newInstance();
assert($a instanceof A);
assert($a->x === 'x');
assert($a->class->getName() === 'C');
assert($a->y === 'y');

Note that for methods, we typically need to know the method reflector
and the class reflector, because the method could be defined in a
base class.

#[Attribute]
class AA {
public function __construct(
#[AttributeContextClass]
public readonly \ReflectionClass $class,
#[AttributeContextMethod]
public readonly ReflectionMethod $method,
) {}
}

class B {
#[AA]
public function f(): void {}
}

class CC extends B {}

$aa = (new ReflectionMethod(CC::class, 'f))->getAttributes()[0]->newInstance();
assert($a->class->getName() === 'CC');
assert($a->method->getName() === 'f');

Notice that the original proposal by Benjamin would use an interface
and a setter method, ReflectorAwareAttribute::setReflector().

I prefer to use constructor parameters, because I generally prefer if
a constructor creates a complete and immutable object.

Thoughts?

-- Andreas

--

To unsubscribe, visit: https://www.php.net/unsub.php

Hi Andreas,

I too have wondered (and I think asked in room11?) about such a concept. From memory the general response was “just do it in userland with a wrapper” so its good to see someone else is interested in this being part of the language.

While I agree that it’s most useful if the Reflector instance is available in the constructor, I’m not keen on the proposed magic “skipping” of arguments as you suggest. It seems way too easy to confuse someone (particularly if the attribute class itself has reason to be instantiated directly in code)

I think a better approach would be to suggest authors put the parameter at the end of the parameter list, so that no ‘skipping' is required when passing arguments without names (or put it where you like if you’re always using named arguments)

Cheers

Stephen

2 years ago by Andreas Hennings — view source

unread

Hello internals,
I am picking up an idea that was mentioned by Benjamin Eberlei in the past.
https://externals.io/message/110217#110395
(we probably had the idea independently, but Benjamin's is the first
post where I see it mentioned in the list)

Quite often I found myself writing attribute classes that need to fill
some default values or do some validation based on the symbol the
attribute is attached to.
E.g. a parameter attribute might require a specific type on that
parameter, or it might fill a default value based on the parameter
name.

Currently I see two ways to do this:

Do the logic in the code that reads the attribute, instead of the
attribute class. This works ok for one-off attribute classes, but it
becomes quite unflexible with attribute interfaces, where 3rd parties
can provide their own attribute class implementations.

Add additional methods to the attribute class that take the symbol
reflector as a parameter, like "setReflectionMethod()", or
"setReflectionClass()". Or the method in the attribute class that
returns the values can have a reflector as a parameter.

Both of these are somewhat limited and unpleasant.

I want to propose a new way to do this.
Get some feedback first, then maybe an RFC.

The idea is to mark constructor parameters of the attribute class with
a special parameter attribute, to receive the reflector.
The other arguments are then shifted to skip the "special" parameter.

#[Attribute]
class A {
public function __construct(
public readonly string $x,
#[AttributeContextClass]
public readonly \ReflectionClass $class,
public readonly string $y,
) {}
}

$a = (new ReflectionClass(C::class))->getAttributes()[0]->newInstance();
assert($a instanceof A);
assert($a->x === 'x');
assert($a->class->getName() === 'C');
assert($a->y === 'y');

Note that for methods, we typically need to know the method reflector
and the class reflector, because the method could be defined in a
base class.

#[Attribute]
class AA {
public function __construct(
#[AttributeContextClass]
public readonly \ReflectionClass $class,
#[AttributeContextMethod]
public readonly ReflectionMethod $method,
) {}
}

class B {
#[AA]
public function f(): void {}
}

class CC extends B {}

$aa = (new ReflectionMethod(CC::class, 'f))->getAttributes()[0]->newInstance();
assert($a->class->getName() === 'CC');
assert($a->method->getName() === 'f');

Notice that the original proposal by Benjamin would use an interface
and a setter method, ReflectorAwareAttribute::setReflector().

I prefer to use constructor parameters, because I generally prefer if
a constructor creates a complete and immutable object.

Thoughts?

-- Andreas

--

To unsubscribe, visit: https://www.php.net/unsub.php

Hi Andreas,

I too have wondered (and I think asked in room11?) about such a concept. From memory the general response was “just do it in userland with a wrapper” so its good to see someone else is interested in this being part of the language.

While I agree that it’s most useful if the Reflector instance is available in the constructor, I’m not keen on the proposed magic “skipping” of arguments as you suggest. It seems way too easy to confuse someone (particularly if the attribute class itself has reason to be instantiated directly in code)

Good point! Almost made me change my mind completely. But I already
changed it back :)

When instantiating in code, the "real" signature would have to be
used, and the reflector argument passed explicitly. This would be
useful for unit tests that want to replicate the realistic behavior.
Also it guarantees that the code of the attribute class can really
count on this value to not be null, no matter how the class is
instantiated.

I think a better approach would be to suggest authors put the parameter at the end of the parameter list, so that no ‘skipping' is required when passing arguments without names (or put it where you like if you’re always using named arguments)

If I understand correctly, the proposal would technically not change,
we just add a recommendation.

The only problem I see is with variadic parameters, which naturally
need to be last.

This brings me to another idea: What if instead of the parameter
attribute we use a pseudo constant default value?
Or some other kind of expression that is only valid in the right context.

#[Attribute]
class A {
function __construct(
..
ReflectionClass $class = REFLECTOR,
..
) {..

The "constant" would be available only as a default value, not inside
the constructor body.

But I already don't like this idea, because
ReflectionParameter->getDefaultValue() would have to throw an
exception, or ->isOptional would have to return false.

Better to have some kind of placeholder in the attribute declaration,
if we dont like the skipping.

#[A('x', ?, 'y')]
or
#[A('x', REFLECTOR, 'y')]

The '?' would be the same as proposed for first class callable partial
function application.

This could be interpreted as that we create a callable with only one
free parameter, which php then invokes with the reflector.

But even the skipping without a placeholder can be justified, because
an attribute declaration is already not regular code, e.g. we don't
write "new ".

Cheers

Stephen

2 years ago by Stephen Reay — view source

unread

(Resending to the list without all the history because qmail complained about message size)

Hi Andreas,

I too have wondered (and I think asked in room11?) about such a concept. >From memory the general response was “just do it in userland with a wrapper” so its good to see someone else is interested in this being part of the language.

While I agree that it’s most useful if the Reflector instance is available in the constructor, I’m not keen on the proposed magic “skipping” of arguments as you suggest. It seems way too easy to confuse someone (particularly if the attribute class itself has reason to be instantiated directly in code)

Good point! Almost made me change my mind completely. But I already
changed it back :)

When instantiating in code, the "real" signature would have to be
used, and the reflector argument passed explicitly.

That’s kind of my point: it’s not super intuitive why (or the specifics of how) it’s being skipped when it’s an attribute, vs when it’s instantiated from code. What if someone specifies an argument with the same name? If they specify args without names, can they just use null for that? Etc.

This would be
useful for unit tests that want to replicate the realistic behavior.
Also it guarantees that the code of the attribute class can really
count on this value to not be null, no matter how the class is
instantiated.

I would expect that whether the Reflector object is required is simply a matter of whether or not the parameter is nullable.

If it’s not nullable, then yes, the explicit instantiation call will need to supply it at the correct location. If it’s only required when created from attribute usage, then it would accept null, and the constructor would have appropriate logic to handle that.

I think a better approach would be to suggest authors put the parameter at the end of the parameter list, so that no ‘skipping' is required when passing arguments without names (or put it where you like if you’re always using named arguments)

If I understand correctly, the proposal would technically not change,
we just add a recommendation.

Technically, yes “my way” would work fine with the proposal you’ve suggested, if I choose to always put the parameter marked by #[ReflectionContext] last.

I’m just concerned about confusing usage if “insert this parameter anywhere” is the ‘recommended’ (i.e. documented example) way to use this feature.

Even with that concern, I still prefer this to most other solutions mentioned so far, for the same reasons: they’re all some degree of magic.

The only other solution I can think of that’s less “magic” and more explicit, is (and I have no idea if this is even feasible technically) to introduce a builtin trait for attribute classes to use, providing a protected method or property that gives access to the Reflector (how the trait has access is not really important, I assume it can be assigned to the object somehow before the constructor is called). I guess this could also be an abstract class, but a trait makes it much easier to adopt so that would be my preferred approach.

So something like

trait AttributeReflector {
protected function getReflector(): \Reflector {
// do internal stuff
}
}

#[Attribute]
class Foo {
Use \AttributeReflector;

public readonly string $name;

function __construct(?string $name = null) {
    $this->name = $name ?? $this->getReflector()->name;
}

}

2 years ago by Andreas Hennings — view source

unread

(Resending to the list without all the history because qmail complained about message size)

Hi Andreas,

I too have wondered (and I think asked in room11?) about such a concept. >From memory the general response was “just do it in userland with a wrapper” so its good to see someone else is interested in this being part of the language.

While I agree that it’s most useful if the Reflector instance is available in the constructor, I’m not keen on the proposed magic “skipping” of arguments as you suggest. It seems way too easy to confuse someone (particularly if the attribute class itself has reason to be instantiated directly in code)

Good point! Almost made me change my mind completely. But I already
changed it back :)

When instantiating in code, the "real" signature would have to be
used, and the reflector argument passed explicitly.

That’s kind of my point: it’s not super intuitive why (or the specifics of how) it’s being skipped when it’s an attribute, vs when it’s instantiated from code. What if someone specifies an argument with the same name? If they specify args without names, can they just use null for that? Etc.

I agree it could be confusing.
But for the named args, I think it is quite obvious:

#[Attribute]
class A {
public readonly array $moreArgs;
public function __construct(
public readonly string $x,
// Reflector parameter can be last required parameter, BUT
#[AttributeDeclaration] public readonly \ReflectionClass $class,
// Optional parameters have to be after the required reflector parameter.
public readonly ?string $y = NULL,
// Variadic parameter must be last.
string ...$moreArgs,
) {
$this->moreArgs = $moreArgs;
}
}

#[A('x', 'y', 'z')] // -> new A('x', $reflector, 'y', 'z')
#[A(x: 'x', y: 'y')] // -> new A('x', $reflector, 'y')
#[A(x: 'x', class: new ReflectionClass('C'))] // -> new A('x', new
ReflectionClass('C'))

We could say that explicitly passing a value for the reflector in an
attribute declaration is forbidden.
Or we allow it, then an explicit value would simply overwrite the
implicit value.

If we use placeholder syntax, the above examples would look like this:

#[A('x', ?, 'y', 'z')] // -> new A('x', $reflector, 'y', 'z')
#[A(x: 'x', class: ?, y: 'y')] // -> new A('x', $reflector, 'y')
#[A(x: 'x', class: new ReflectionClass('C'))] // -> new A('x', new
ReflectionClass('C'))

This would be
useful for unit tests that want to replicate the realistic behavior.
Also it guarantees that the code of the attribute class can really
count on this value to not be null, no matter how the class is
instantiated.

I would expect that whether the Reflector object is required is simply a matter of whether or not the parameter is nullable.
If it’s not nullable, then yes, the explicit instantiation call will need to supply it at the correct location. If it’s only required when created from attribute usage, then it would accept null, and the constructor would have appropriate logic to handle that.

Yes.
But I would expect the common practice to be to make it required,
because then the constructor code will be simpler.

I think a better approach would be to suggest authors put the parameter at the end of the parameter list, so that no ‘skipping' is required when passing arguments without names (or put it where you like if you’re always using named arguments)

If I understand correctly, the proposal would technically not change,
we just add a recommendation.

Technically, yes “my way” would work fine with the proposal you’ve suggested, if I choose to always put the parameter marked by #[ReflectionContext] last.

As above, the problem with this would be optional and variadic
parameters, which have to come after a required reflector parameter.

I’m just concerned about confusing usage if “insert this parameter anywhere” is the ‘recommended’ (i.e. documented example) way to use this feature.

Even with that concern, I still prefer this to most other solutions mentioned so far, for the same reasons: they’re all some degree of magic.

The only other solution I can think of that’s less “magic” and more explicit, is (and I have no idea if this is even feasible technically) to introduce a builtin trait for attribute classes to use, providing a protected method or property that gives access to the Reflector (how the trait has access is not really important, I assume it can be assigned to the object somehow before the constructor is called). I guess this could also be an abstract class, but a trait makes it much easier to adopt so that would be my preferred approach.

So something like

trait AttributeReflector {
protected function getReflector(): \Reflector {
// do internal stuff
}
}

#[Attribute]
class Foo {
Use \AttributeReflector;
public readonly string $name;

function __construct(?string $name = null) {
    $this->name = $name ?? $this->getReflector()->name;
}
}

I was also considering this, but there is a problem.
When you instantiate the attribute class outside an attribute
declaration, how do you tell it about a required reflector?
This would be relevant e.g. during unit tests.

The constructor parameter provides a clean way to pass a custom reflector.
But with the ->getReflector(), the reflector would already be
magically added to the object before the constructor is executed. This
would be impossible to replicate in custom code outside an attribute
declaration.

Cheers
Andreas

--

To unsubscribe, visit: https://www.php.net/unsub.php

2 years ago by Larry Garfield — view source

unread

(Resending to the list without all the history because qmail complained about message size)

Hi Andreas,

I too have wondered (and I think asked in room11?) about such a concept. >From memory the general response was “just do it in userland with a wrapper” so its good to see someone else is interested in this being part of the language.

While I agree that it’s most useful if the Reflector instance is available in the constructor, I’m not keen on the proposed magic “skipping” of arguments as you suggest. It seems way too easy to confuse someone (particularly if the attribute class itself has reason to be instantiated directly in code)

Good point! Almost made me change my mind completely. But I already
changed it back :)

When instantiating in code, the "real" signature would have to be
used, and the reflector argument passed explicitly.

That’s kind of my point: it’s not super intuitive why (or the specifics of how) it’s being skipped when it’s an attribute, vs when it’s instantiated from code. What if someone specifies an argument with the same name? If they specify args without names, can they just use null for that? Etc.

I agree it could be confusing.
But for the named args, I think it is quite obvious:

#[Attribute]
class A {
public readonly array $moreArgs;
public function __construct(
public readonly string $x,
// Reflector parameter can be last required parameter, BUT
#[AttributeDeclaration] public readonly \ReflectionClass $class,
// Optional parameters have to be after the required reflector parameter.
public readonly ?string $y = NULL,
// Variadic parameter must be last.
string ...$moreArgs,
) {
$this->moreArgs = $moreArgs;
}
}

#[A('x', 'y', 'z')] // -> new A('x', $reflector, 'y', 'z')
#[A(x: 'x', y: 'y')] // -> new A('x', $reflector, 'y')
#[A(x: 'x', class: new ReflectionClass('C'))] // -> new A('x', new
ReflectionClass('C'))

We could say that explicitly passing a value for the reflector in an
attribute declaration is forbidden.
Or we allow it, then an explicit value would simply overwrite the
implicit value.

If we use placeholder syntax, the above examples would look like this:

#[A('x', ?, 'y', 'z')] // -> new A('x', $reflector, 'y', 'z')
#[A(x: 'x', class: ?, y: 'y')] // -> new A('x', $reflector, 'y')
#[A(x: 'x', class: new ReflectionClass('C'))] // -> new A('x', new
ReflectionClass('C'))

This would be
useful for unit tests that want to replicate the realistic behavior.
Also it guarantees that the code of the attribute class can really
count on this value to not be null, no matter how the class is
instantiated.

I would expect that whether the Reflector object is required is simply a matter of whether or not the parameter is nullable.
If it’s not nullable, then yes, the explicit instantiation call will need to supply it at the correct location. If it’s only required when created from attribute usage, then it would accept null, and the constructor would have appropriate logic to handle that.

Yes.
But I would expect the common practice to be to make it required,
because then the constructor code will be simpler.

I think a better approach would be to suggest authors put the parameter at the end of the parameter list, so that no ‘skipping' is required when passing arguments without names (or put it where you like if you’re always using named arguments)

If I understand correctly, the proposal would technically not change,
we just add a recommendation.

Technically, yes “my way” would work fine with the proposal you’ve suggested, if I choose to always put the parameter marked by #[ReflectionContext] last.

As above, the problem with this would be optional and variadic
parameters, which have to come after a required reflector parameter.
I’m just concerned about confusing usage if “insert this parameter anywhere” is the ‘recommended’ (i.e. documented example) way to use this feature.

Even with that concern, I still prefer this to most other solutions mentioned so far, for the same reasons: they’re all some degree of magic.

The only other solution I can think of that’s less “magic” and more explicit, is (and I have no idea if this is even feasible technically) to introduce a builtin trait for attribute classes to use, providing a protected method or property that gives access to the Reflector (how the trait has access is not really important, I assume it can be assigned to the object somehow before the constructor is called). I guess this could also be an abstract class, but a trait makes it much easier to adopt so that would be my preferred approach.

So something like

trait AttributeReflector {
protected function getReflector(): \Reflector {
// do internal stuff
}
}

#[Attribute]
class Foo {
Use \AttributeReflector;
public readonly string $name;

function __construct(?string $name = null) {
    $this->name = $name ?? $this->getReflector()->name;
}
}
I was also considering this, but there is a problem.
When you instantiate the attribute class outside an attribute
declaration, how do you tell it about a required reflector?
This would be relevant e.g. during unit tests.

The constructor parameter provides a clean way to pass a custom reflector.
But with the ->getReflector(), the reflector would already be
magically added to the object before the constructor is executed. This
would be impossible to replicate in custom code outside an attribute
declaration.

I've run into this issue in my attribute library as well (https://github.com/Crell/AttributeUtils). What I do there is allow attributes to opt-in to a callback method via interface. For example:

#[\Attribute]
class AttribWithName implements FromReflectionClass
{
public readonly string $name;

public function __construct(?string $name = null) 
{
    if ($name) {
        $this->name = $name;
    }
}

public function fromReflection(\ReflectionClass $subject): void
{
    $this->name ??= $subject->getShortName();
}

}

(Side note: This is why static analysis tools that forbid writing to readonly properties except from the constructor are wrong; it's also an example of where asymmetric visibility would be superior to readonly. But I digress.)

My preference would be for something along those lines to be implemented in core.

Importantly, we MUST NOT design it such that the reflection object gets assigned to a property of the attribute. Reflection objects are not serializable. Attributes will frequently be cached. That means it forces the attribute author to make the property nullable AND then unset it sometime before the attribute gets serialized, or it will break. That's a no-go.

That's why I think an opt-in interface is the way to go. It also avoids any confusion around the constructor parameters, which is, based on this thread, a confusing area. :-)

My second preference would be the ReflectionAttribute::inProgress() call in the constructor, or something like that. I like that less because it's a stateful call, but it would also reduce the issue with readonly properties (as in the example above) by making both alternatives available in the constructor, so maybe it's an acceptable tradeoff.

I can see an argument either direction here.

--Larry Garfield

2 years ago by Andreas Hennings — view source

unread

I've run into this issue in my attribute library as well (https://github.com/Crell/AttributeUtils). What I do there is allow attributes to opt-in to a callback method via interface. For example:

#[\Attribute]
class AttribWithName implements FromReflectionClass
{
public readonly string $name;
public function __construct(?string $name = null)
{
    if ($name) {
        $this->name = $name;
    }
}

public function fromReflection(\ReflectionClass $subject): void
{
    $this->name ??= $subject->getShortName();
}
}

So it is technically from outside it is a setter, whereas from inside
it is not really.
Technically, this means you need a version of the attribute object
that can exist without the values from the reflector.
The constructor needs to initialize some "stub" values, until the
setter is called.
Every other method also needs to support the case when the setter has
not been called yet, or possibly throw an exception.
Also, your property type has to allow null, so ?string, not string.

I usually try to avoid this, this is why I proposed the constructor parameter.
This way, the object is never in an incomplete state.

(Side note: Personally I use the naming convention from*() for static
factory methods. I might use set*() for this one, but then again, it
is only a setter from the outside.)

(Side note: This is why static analysis tools that forbid writing to readonly properties except from the constructor are wrong; it's also an example of where asymmetric visibility would be superior to readonly. But I digress.)

Technically there is no guarantee that the setters will be called
before any other method, and only once.
If these methods can write on readonly properties, then any method can.
Unless we somehow mark these methods as special.

On the other hand, a wither method with "clone with" should be allowed
to work on readonly properties.
You could rewrite your method like this, once we have clone with:
(or use the old-school syntax but without readonly)

public function withReflector(\ReflectionClass $subject): static
{
    return ($this->name !== NULL)
        ? $this
        : clone $this with (name: $subject->getShortName();
}

Then in the discovery code:

$attribute = $reflectionClass->getAttributes()[0]->newInstance();
$attribute = $attribute->withReflector($reflectionClass);

My preference would be for something along those lines to be implemented in core.

Importantly, we MUST NOT design it such that the reflection object gets assigned to a property of the attribute. Reflection objects are not serializable. Attributes will frequently be cached. That means it forces the attribute author to make the property nullable AND then unset it sometime before the attribute gets serialized, or it will break. That's a no-go.

There could be ways around this problem, but I agree we should avoid
it on design level.

That's why I think an opt-in interface is the way to go. It also avoids any confusion around the constructor parameters, which is, based on this thread, a confusing area. :-)

What do you think about a placeholder syntax to avoid confusion with a
skipped parameter?
Like

#[A('x', ?', 'y')]

My second preference would be the ReflectionAttribute::inProgress() call in the constructor, or something like that. I like that less because it's a stateful call, but it would also reduce the issue with readonly properties (as in the example above) by making both alternatives available in the constructor, so maybe it's an acceptable tradeoff.

I would like to avoid anything that is stateful or that leaves an
incomplete stub object.
(But I think I made this clear enough, so..)

I can see an argument either direction here.

--Larry Garfield

--

To unsubscribe, visit: https://www.php.net/unsub.php

2 years ago by Larry Garfield — view source

unread

I've run into this issue in my attribute library as well (https://github.com/Crell/AttributeUtils). What I do there is allow attributes to opt-in to a callback method via interface. For example:

#[\Attribute]
class AttribWithName implements FromReflectionClass
{
public readonly string $name;
public function __construct(?string $name = null)
{
    if ($name) {
        $this->name = $name;
    }
}

public function fromReflection(\ReflectionClass $subject): void
{
    $this->name ??= $subject->getShortName();
}
}
So it is technically from outside it is a setter, whereas from inside
it is not really.
Technically, this means you need a version of the attribute object
that can exist without the values from the reflector.
The constructor needs to initialize some "stub" values, until the
setter is called.
Every other method also needs to support the case when the setter has
not been called yet, or possibly throw an exception.
Also, your property type has to allow null, so ?string, not string.

Except the property type is not null above? The argument is, but not the property. (I could use CPP here with a null value instead, if we had asymmetric visibility.)

Also, if the interface is called by the reflection system itself as part of getInstance() then yes, we can guarantee that it's been run, just like we can guarantee the constructor has run.

(Side note: This is why static analysis tools that forbid writing to readonly properties except from the constructor are wrong; it's also an example of where asymmetric visibility would be superior to readonly. But I digress.)

Technically there is no guarantee that the setters will be called
before any other method, and only once.
If these methods can write on readonly properties, then any method can.
Unless we somehow mark these methods as special.

There's nothing special about readonly properties there. An uninitialized non-readonly property is no more or less susceptible to still being uninitialized when you need it. Singling out readonly here is just dumb, and exacerbates the problems of readonly.

On the other hand, a wither method with "clone with" should be allowed
to work on readonly properties.
You could rewrite your method like this, once we have clone with:
(or use the old-school syntax but without readonly)
public function withReflector(\ReflectionClass $subject): static
{
    return ($this->name !== NULL)
        ? $this
        : clone $this with (name: $subject->getShortName();
}
Then in the discovery code:
$attribute = $reflectionClass->getAttributes()[0]->newInstance();
$attribute = $attribute->withReflector($reflectionClass);

In that library I actually have several opt-in post-constructor setup routines. The documentation covers them all. Making them all withers would be just needlessly slow. Basically it's an example of a "mutable and then locked" object, which I emulate by virtue of only calling the setup methods from the library, and using readonly properties.

That's why I think an opt-in interface is the way to go. It also avoids any confusion around the constructor parameters, which is, based on this thread, a confusing area. :-)

What do you think about a placeholder syntax to avoid confusion with a
skipped parameter?
Like

#[A('x', ?', 'y')]

Oh god no. For one, function calls are already complicated enough and the next thing we should add there is some kind of partial application, which already discussed using ?. Second, that the attribute cares about what it is attached to is not something the caller should care about. If the user of the attribute needs to specify "and put the reflection object here" in the attribute usage, we've failed. I'd vote against any such proposal in any form, whatever the syntax.

My second preference would be the ReflectionAttribute::inProgress() call in the constructor, or something like that. I like that less because it's a stateful call, but it would also reduce the issue with readonly properties (as in the example above) by making both alternatives available in the constructor, so maybe it's an acceptable tradeoff.

I would like to avoid anything that is stateful or that leaves an
incomplete stub object.
(But I think I made this clear enough, so..)

I don't think it's going to be possible to have both. We either have a static call of some kind (the inProgress() method above or similar) that only works sometimes (ie, while an attribute is being constructed), or we have multiple setup methods (which the engine can ensure are called, but technically that does mean the object has a moment where it's potentially incomplete). I doubt we can have our cake and eat it too, here.

Personally I think the "thou shalt not ever have an object that isn't perfectly setup" rule is over-blown. It's a good guideline, but it has edge cases where it just simply doesn't work. This is one of them.

--Larry Garfield

2 years ago by Andreas Hennings — view source

unread

I've run into this issue in my attribute library as well (https://github.com/Crell/AttributeUtils). What I do there is allow attributes to opt-in to a callback method via interface. For example:

#[\Attribute]
class AttribWithName implements FromReflectionClass
{
public readonly string $name;
public function __construct(?string $name = null)
{
    if ($name) {
        $this->name = $name;
    }
}

public function fromReflection(\ReflectionClass $subject): void
{
    $this->name ??= $subject->getShortName();
}
}
So it is technically from outside it is a setter, whereas from inside
it is not really.
Technically, this means you need a version of the attribute object
that can exist without the values from the reflector.
The constructor needs to initialize some "stub" values, until the
setter is called.
Every other method also needs to support the case when the setter has
not been called yet, or possibly throw an exception.
Also, your property type has to allow null, so ?string, not string.
Except the property type is not null above? The argument is, but not the property. (I could use CPP here with a null value instead, if we had asymmetric visibility.)

True!
My conception of readonly was distorted by the same static analysis
tools that you complained about earlier!
https://3v4l.org/CUgYv
I think a more accurate label would be "write once".

Also, if the interface is called by the reflection system itself as part of getInstance() then yes, we can guarantee that it's been run, just like we can guarantee the constructor has run.

The concern was that you can also construct the attribute object with
a regular new AttribWithName() expression, and then you can omit the
->fromReflection() call.
But we could say it is part of the contract that you have to call the
method before the object is fully operational.

(Side note: This is why static analysis tools that forbid writing to readonly properties except from the constructor are wrong; it's also an example of where asymmetric visibility would be superior to readonly. But I digress.)

Technically there is no guarantee that the setters will be called
before any other method, and only once.
If these methods can write on readonly properties, then any method can.
Unless we somehow mark these methods as special.

There's nothing special about readonly properties there. An uninitialized non-readonly property is no more or less susceptible to still being uninitialized when you need it. Singling out readonly here is just dumb, and exacerbates the problems of readonly.
On the other hand, a wither method with "clone with" should be allowed
to work on readonly properties.
You could rewrite your method like this, once we have clone with:
(or use the old-school syntax but without readonly)
public function withReflector(\ReflectionClass $subject): static
{
    return ($this->name !== NULL)
        ? $this
        : clone $this with (name: $subject->getShortName();
}
Then in the discovery code:
$attribute = $reflectionClass->getAttributes()[0]->newInstance();
$attribute = $attribute->withReflector($reflectionClass);
In that library I actually have several opt-in post-constructor setup routines. The documentation covers them all. Making them all withers would be just needlessly slow. Basically it's an example of a "mutable and then locked" object, which I emulate by virtue of only calling the setup methods from the library, and using readonly properties.

That's why I think an opt-in interface is the way to go. It also avoids any confusion around the constructor parameters, which is, based on this thread, a confusing area. :-)

What do you think about a placeholder syntax to avoid confusion with a
skipped parameter?
Like

#[A('x', ?', 'y')]

Oh god no. For one, function calls are already complicated enough and the next thing we should add there is some kind of partial application, which already discussed using ?. Second, that the attribute cares about what it is attached to is not something the caller should care about. If the user of the attribute needs to specify "and put the reflection object here" in the attribute usage, we've failed. I'd vote against any such proposal in any form, whatever the syntax.

I was actually thinking of the partial application with the ? symbol.
The metaphor would be that we create a partial callable from the new A(.., ?) expression, and then PHP calls that callable with the
reflector argument.
It would all just be a metaphor to justify the ? placeholder,
because PHP would actually just skip the extra step and evaluate the
expression in one go.

But the only reason was to have something to avoid a skipped parameter.
We could also put other placeholders that feel more like expressions,
but I am not happy with that either.

In the end I could also live with a method call as you propose.

My second preference would be the ReflectionAttribute::inProgress() call in the constructor, or something like that. I like that less because it's a stateful call, but it would also reduce the issue with readonly properties (as in the example above) by making both alternatives available in the constructor, so maybe it's an acceptable tradeoff.

I would like to avoid anything that is stateful or that leaves an
incomplete stub object.
(But I think I made this clear enough, so..)

I don't think it's going to be possible to have both. We either have a static call of some kind (the inProgress() method above or similar) that only works sometimes (ie, while an attribute is being constructed), or we have multiple setup methods (which the engine can ensure are called, but technically that does mean the object has a moment where it's potentially incomplete). I doubt we can have our cake and eat it too, here.

Personally I think the "thou shalt not ever have an object that isn't perfectly setup" rule is over-blown. It's a good guideline, but it has edge cases where it just simply doesn't work. This is one of them.

--Larry Garfield

--

To unsubscribe, visit: https://www.php.net/unsub.php

2 years ago by kontakt@beberlei.de — view source

unread

On Tue, May 30, 2023 at 2:49 AM Andreas Hennings andreas@dqxtech.net
wrote:

Hello internals,
I am picking up an idea that was mentioned by Benjamin Eberlei in the past.
https://externals.io/message/110217#110395
(we probably had the idea independently, but Benjamin's is the first
post where I see it mentioned in the list)

Quite often I found myself writing attribute classes that need to fill
some default values or do some validation based on the symbol the
attribute is attached to.
E.g. a parameter attribute might require a specific type on that
parameter, or it might fill a default value based on the parameter
name.

Currently I see two ways to do this:

Do the logic in the code that reads the attribute, instead of the
attribute class. This works ok for one-off attribute classes, but it
becomes quite unflexible with attribute interfaces, where 3rd parties
can provide their own attribute class implementations.

Add additional methods to the attribute class that take the symbol
reflector as a parameter, like "setReflectionMethod()", or
"setReflectionClass()". Or the method in the attribute class that
returns the values can have a reflector as a parameter.

Both of these are somewhat limited and unpleasant.

I want to propose a new way to do this.
Get some feedback first, then maybe an RFC.

The idea is to mark constructor parameters of the attribute class with
a special parameter attribute, to receive the reflector.
The other arguments are then shifted to skip the "special" parameter.

#[Attribute]
class A {
public function __construct(
public readonly string $x,
#[AttributeContextClass]
public readonly \ReflectionClass $class,
public readonly string $y,
) {}
}

$a = (new ReflectionClass(C::class))->getAttributes()[0]->newInstance();
assert($a instanceof A);
assert($a->x === 'x');
assert($a->class->getName() === 'C');
assert($a->y === 'y');

Note that for methods, we typically need to know the method reflector
and the class reflector, because the method could be defined in a
base class.

#[Attribute]
class AA {
public function __construct(
#[AttributeContextClass]
public readonly \ReflectionClass $class,
#[AttributeContextMethod]
public readonly ReflectionMethod $method,
) {}
}

class B {
#[AA]
public function f(): void {}
}

class CC extends B {}

$aa = (new ReflectionMethod(CC::class,
'f))->getAttributes()[0]->newInstance();
assert($a->class->getName() === 'CC');
assert($a->method->getName() === 'f');

Notice that the original proposal by Benjamin would use an interface
and a setter method, ReflectorAwareAttribute::setReflector().

I prefer to use constructor parameters, because I generally prefer if
a constructor creates a complete and immutable object.

Thank you bringing this up, the more I work with attributes the more often
this comes up. I think when we designed the attributes there was just so
little concrete exprimentation that we didn't pick this up as a serious
missing gap.

As for implementation, reviewing the whole e-mail thread, i like both:

1, ReflectionAttribute::getReflectionTarget() - this we should add no
matter what and is a no brainer
2. An argument attribute that instructs newInstance() to inject the
reflector or the ReflectionAttribute, for example #[AttributeContext] or
#[AttributeTargetReflector]

Thoughts?

-- Andreas

--

To unsubscribe, visit: https://www.php.net/unsub.php

2 months ago by Andreas Hennings — view source

unread

To follow up on this idea.

A less problematic solution would be to have a global function or
static method similar to func_get_args(), that can only be called from
within the constructor of an attribute.
A function would be the most natural. But if we are afraid of name
clashes with userland functions, we could introduce
\Attribute::getReflector().

#[\Attribute]
class MyAttribute {
public \Reflector $reflector;
public function __construct() {
$this->reflector = \Attribute::getReflector();
}
}

#[MyAttribute]
class C {}

$reflection_class = new ReflectionClass(C::class);
$reflection_attribute = $reflector->getAttributes()[0];
$attribute = ->newInstance();
assert($reflection_class === $attribute->reflector);

We could go a bit further with this:

We could have one method to get the ReflectionAttribute, and another
to get the reflector of the symbol the attribute is attached to.
We could have the ReflectionAttribute be aware of the original
reflector, so that

$reflection_class = new ReflectionClass(C::class);
$reflection_attribute = $reflection_class->getAttributes()[0];
assert($reflection_attribute->getOriginatingReflector() === $reflection_class);

and

#[\Attribute(\Attribute::TARGET_CLASS)]
class MyAttribute {
public function __construct() {
assert(\Attribute::getOriginatingReflector() instanceof
\ReflectionClass);
assert(\Attribute::getAttributeReflector() instanceof
\ReflectionAttribute);
}
}

But outside of that constructor scope it would fail:

\Attribute::getOriginatingReflector(); // Throws RuntimeExeption, or a
dedicated exception type.
new MyAttribute(); // Throws RuntimeException or a dedicated exception type.

The exact behavior here can be discussed.

--- Andreas

Hello internals,
I am picking up an idea that was mentioned by Benjamin Eberlei in the past.
https://externals.io/message/110217#110395
(we probably had the idea independently, but Benjamin's is the first
post where I see it mentioned in the list)

Quite often I found myself writing attribute classes that need to fill
some default values or do some validation based on the symbol the
attribute is attached to.
E.g. a parameter attribute might require a specific type on that
parameter, or it might fill a default value based on the parameter
name.

Currently I see two ways to do this:

Do the logic in the code that reads the attribute, instead of the
attribute class. This works ok for one-off attribute classes, but it
becomes quite unflexible with attribute interfaces, where 3rd parties
can provide their own attribute class implementations.

Add additional methods to the attribute class that take the symbol
reflector as a parameter, like "setReflectionMethod()", or
"setReflectionClass()". Or the method in the attribute class that
returns the values can have a reflector as a parameter.

Both of these are somewhat limited and unpleasant.

I want to propose a new way to do this.
Get some feedback first, then maybe an RFC.

The idea is to mark constructor parameters of the attribute class with
a special parameter attribute, to receive the reflector.
The other arguments are then shifted to skip the "special" parameter.

#[Attribute]
class A {
public function __construct(
public readonly string $x,
#[AttributeContextClass]
public readonly \ReflectionClass $class,
public readonly string $y,
) {}
}

$a = (new ReflectionClass(C::class))->getAttributes()[0]->newInstance();
assert($a instanceof A);
assert($a->x === 'x');
assert($a->class->getName() === 'C');
assert($a->y === 'y');

Note that for methods, we typically need to know the method reflector
and the class reflector, because the method could be defined in a
base class.

#[Attribute]
class AA {
public function __construct(
#[AttributeContextClass]
public readonly \ReflectionClass $class,
#[AttributeContextMethod]
public readonly ReflectionMethod $method,
) {}
}

class B {
#[AA]
public function f(): void {}
}

class CC extends B {}

$aa = (new ReflectionMethod(CC::class, 'f))->getAttributes()[0]->newInstance();
assert($a->class->getName() === 'CC');
assert($a->method->getName() === 'f');

Notice that the original proposal by Benjamin would use an interface
and a setter method, ReflectorAwareAttribute::setReflector().

I prefer to use constructor parameters, because I generally prefer if
a constructor creates a complete and immutable object.

Thoughts?

-- Andreas