Newsgroups: php.internals
Path: news.php.net
Xref: news.php.net php.internals:108542
User-Agent: Cyrus-JMAP/3.1.7-802-g7a41c81-fmstable-20200203v1
Mime-Version: 1.0
Message-ID: <431e0a2e-bc95-493f-9dde-a4a07000d76c@www.fastmail.com>
In-Reply-To: <f1a5537b-9b76-3852-83d0-c7189c750d11@freedom.nl>
References: 
 <CAD5P3=JwFK03L-DWJYaa0dbBj1chbef_=K-vyawLNeoAz7dHjA@mail.gmail.com>
 <CALSrLtYm+TSEwo+05JC6LXrQsfzb32pQ24Y7gDUmm-7+NwYi8w@mail.gmail.com>
 <CAOWwgpm43=hoCvsskq2qyhB3yVxntLAYC8hQjQwd9dnvQ=GKkg@mail.gmail.com>
 <CA+kxMuQgdS+N90PsPXPrzgOaebVS1OJUinJ=T+61PaMXcO+-GQ@mail.gmail.com>
 <f1a5537b-9b76-3852-83d0-c7189c750d11@freedom.nl>
Date: Thu, 13 Feb 2020 10:55:04 -0600
To: "php internals" <internals@lists.php.net>
Content-Type: text/plain
Subject: Re: [PHP-DEV] [RFC]
From: larry@garfieldtech.com ("Larry Garfield")

On Tue, Feb 11, 2020, at 1:08 PM, Dik Takken wrote:

> On 11-02-2020 19:46, Larry Garfield wrote:
> >
> > I would love a nicer way to reference function names; it's really ugly
> > to do functional code in PHP otherwise, or even just dynamic function
> > logic within a namespace.  If I never have to write $fn = __NAMESPACE__
> > . '\a_func' again, it will be too soon. :-)
> 
> Perhaps Larry can convince us all to go for something like $() by
> posting a fabulous functional programming example?

I walked right into that one, didn't I...


Well, Dik asked me to post a "fabulous functional programming example".  I don't have one, so I'll go with one from the book I'm working on instead. :-)

This is what I have now, assuming all global functions:

$result = Stats::of($string)
    ->analyze('normalizeNewlines')
    ->analyze('readingLevel')
    ->analyze('countStats')
    ->analyze(fn($s) => wordDistribution($s, 3))
;

I think we are all in agreement that is sub-optimal.  If using functions in the `Stats\Analyzer` namespace, you end up with this:

$result = Stats::of($string)
    ->analyze('\Stats\Analyzer\normalizeNewlines')
    ->analyze('\Stats\Analyzer\readingLevel')
    ->analyze('\Stats\Analyzer\countStats')
    ->analyze(fn($s) => \Stats\Analyzer\wordDistribution($s, 3))
;

Or if you're in that namespace (as I often am) you can do:

$ns = __NAMESPACE__;

$result = Stats::of($string)
    ->analyze($ns . '\normalizeNewlines')
    ->analyze($ns . '\readingLevel')
    ->analyze($ns . '\countStats')
    ->analyze(fn($s) => \Stats\Analyzer\wordDistribution($s, 3))
;

This is doubleplusungood.
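
For anyone wondering why the prefix is needed at all: a string callable is resolved as a fully qualified name at run time, not relative to the current namespace.  A minimal sketch, where the body of readingLevel() is a made-up stand-in and not the real analyzer:

```php
<?php
namespace Stats\Analyzer;

// Hypothetical stand-in for one of the analyzer functions.
function readingLevel(string $s): int
{
    return str_word_count($s);
}

// The bare name is looked up in the global namespace, so it is not found.
$short = is_callable('readingLevel');                   // false

// The fully qualified name is found.
$full = is_callable(__NAMESPACE__ . '\readingLevel');   // true

$value = call_user_func(__NAMESPACE__ . '\readingLevel', 'one two three'); // 3
```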

I'll expand the example to use some methods, too, for variety's sake:


$result = Stats::of($string)
    ->analyze('normalizeNewlines')
    ->analyze('readingLevel')
    ->analyze('countStats')
    ->analyze([$synonymSuggester, 'analyze'])
    ->analyze([WordCounter::class, 'analyze'])
    ->analyze(fn($s) => wordDistribution($s, 3))
;


As I understand it, the goal in this thread is to:

1) Make code like the above more statically analyzable by not using strings. (That covers a number of areas.)
2) Make code like the above easier to write by using symbols rather than strings.

In each case, the analyze() method wants a callable with a specific signature.  Enforced callable signatures are *absolutely something we desperately need*, but are not in scope at the moment.  In this example, then, that a function name is a string is incidental; what we care about is that it's callable.  I don't have a good example offhand for it actually wanting a string, although strings are callables.
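
To make "what we care about is that it's callable" concrete, here is a rough sketch of what such a pipeline could look like.  The Stats class body, the value() accessor, and the Shouter class are all assumptions made up for illustration; the real code may differ.

```php
<?php
// Hypothetical sketch of the Stats pipeline, not the actual implementation.
final class Stats
{
    private $value;

    private function __construct($value)
    {
        $this->value = $value;
    }

    public static function of($value): self
    {
        return new self($value);
    }

    // Accepts any callable: a function-name string, an [object, 'method']
    // array, a [ClassName::class, 'staticMethod'] array, or a closure.
    public function analyze(callable $step): self
    {
        return new self($step($this->value));
    }

    // Hypothetical accessor so the result can be inspected.
    public function value()
    {
        return $this->value;
    }
}

// Hypothetical object with an analyze() method.
final class Shouter
{
    public function analyze(string $s): string
    {
        return strtoupper($s);
    }
}

$result = Stats::of('  hello  ')
    ->analyze('trim')                       // string callable
    ->analyze([new Shouter(), 'analyze'])   // array callable
    ->analyze(fn($s) => $s . '!')           // closure
    ->value();                              // "HELLO!"
```

The `callable` type only checks that the argument can be called; it says nothing about the signature, which is the gap mentioned above.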

So, given this import block:

use function Stats\Analyzer\normalizeNewlines;
use function Stats\Analyzer\readingLevel;
use function Stats\Analyzer\countStats;
use function Stats\Analyzer\wordDistribution;
use Stats\Analyzer\Stats;
use Stats\Analyzer\WordCounter;

here's the above example cast into a number of the different proposals floating about this thread:

::function

$result = Stats::of($string)
    ->analyze(normalizeNewlines::function)
    ->analyze(readingLevel::function)
    ->analyze(countStats::function)
    ->analyze([$synonymSuggester, 'analyze'])
    ->analyze([WordCounter::class, 'analyze'])
    ->analyze(fn($s) => wordDistribution($s, 3))
;

Analysis: I stand by my earlier statement that ::function is just too damned long for this functionality, especially when shorter, already-reserved options exist.

::fn

$result = Stats::of($string)
    ->analyze(normalizeNewlines::fn)
    ->analyze(readingLevel::fn)
    ->analyze(countStats::fn)
    ->analyze([$synonymSuggester, 'analyze'])
    ->analyze([WordCounter::class, 'analyze'])
    ->analyze(fn($s) => wordDistribution($s, 3))
;

Analysis: Same thing, less typing.  

::name

$result = Stats::of($string)
    ->analyze(normalizeNewlines::name)
    ->analyze(readingLevel::name)
    ->analyze(countStats::name)
    ->analyze([$synonymSuggester, 'analyze'])
    ->analyze([WordCounter::name, 'analyze'])
    ->analyze(fn($s) => wordDistribution($s, 3))
;

Analysis: This one gives us unification between classes and functions.  I think that would be good, although it's not a hard requirement for the functionality.


Any of the above could be expanded, potentially, to include methods, like so:

::name, for methods, too

$result = Stats::of($string)
    ->analyze(normalizeNewlines::name)
    ->analyze(readingLevel::name)
    ->analyze(countStats::name)
    ->analyze([$synonymSuggester, analyze::name])
    ->analyze([WordCounter::name, analyze::name])
    ->analyze(fn($s) => wordDistribution($s, 3))
;

However, that requires the parser being able to tell the difference between analyze::name inside a callable array and not.  That's... potentially complicated, since you can also just have an array that contains both objects and strings, or objects and callables, especially as objects can also be callables with __invoke().
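
A small sketch of that ambiguity as PHP behaves today.  Both classes here are hypothetical and exist only to illustrate the point: whether an [object, string] array is a callable is decided at run time, from whether the method exists.

```php
<?php
// Hypothetical classes, for illustration only.
final class SynonymSuggester
{
    public function analyze(string $s): string
    {
        return $s;
    }
}

final class Invokable
{
    public function __invoke(string $s): string
    {
        return $s;
    }
}

$synonymSuggester = new SynonymSuggester();

// Same shape, [object, string], but only the first names an existing method.
$callablePair = [$synonymSuggester, 'analyze'];
$plainPair    = [$synonymSuggester, 'some label'];

$first  = is_callable($callablePair);   // true
$second = is_callable($plainPair);      // false

// An object with __invoke() is itself callable, so an array holding
// an object and a callable is also just an ordinary array of data.
$third = is_callable(new Invokable());  // true
```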


$() or variations thereof

$result = Stats::of($string)
    ->analyze($(normalizeNewlines))
    ->analyze($(readingLevel))
    ->analyze($(countStats))
    ->analyze($($synonymSuggester, 'analyze'))
    ->analyze($(WordCounter::class, 'analyze'))
    ->analyze(fn($s) => wordDistribution($s, 3))
;

Analysis: I'm not sure I like this one, visually.  "($(..))" feels like a lot of sigil soup.  It also doesn't offer a way to deal with the method names in the object versions, unless we just assume that bare strings are allowed there, like so:

$result = Stats::of($string)
    ->analyze($(normalizeNewlines))
    ->analyze($(readingLevel))
    ->analyze($(countStats))
    ->analyze($($synonymSuggester, analyze))
    ->analyze($(WordCounter::class, analyze))
    ->analyze(fn($s) => wordDistribution($s, 3))
;

I will say that, given the behavior of ::class now, ::fn or ::name "feel like" they should return a string, whereas $() "feels like" it should return a callable, or something richer than a string.  That's naturally subjective but is consistent with ::foo being a constant value and $() being, um, jQuery.

Also suggested was this:

closure()

$result = Stats::of($string)
    ->analyze(closure(normalizeNewlines))
    ->analyze(closure(readingLevel))
    ->analyze(closure(countStats))
    ->analyze(closure($synonymSuggester->analyze))
    ->analyze(closure(WordCounter::analyze))
    ->analyze(fn($s) => wordDistribution($s, 3))
;

Which aside from being verbose really does look an awful lot like a function call, and is how I'd interpret it.  That said, a syntax that would recognize `$foo->bar` as a callable, not an invocation, and `Foo::bar` as a callable, would be really nice.
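
For comparison, this is roughly what getting a Closure for each of those cases looks like with what PHP 7.1+ already provides, Closure::fromCallable().  The two classes and their method bodies are hypothetical stand-ins; note that the method names are still strings.

```php
<?php
// Hypothetical stand-ins for the classes used in the examples.
final class WordCounter
{
    public static function analyze(string $s): int
    {
        return str_word_count($s);
    }
}

final class SynonymSuggester
{
    public function analyze(string $s): string
    {
        return str_replace('big', 'large', $s);
    }
}

$synonymSuggester = new SynonymSuggester();

// Instance method, static method, and plain function, each as a Closure.
$fromMethod   = Closure::fromCallable([$synonymSuggester, 'analyze']);
$fromStatic   = Closure::fromCallable([WordCounter::class, 'analyze']);
$fromFunction = Closure::fromCallable('strtoupper');

$a = $fromMethod('a big word');   // "a large word"
$b = $fromStatic('a big word');   // 3
$c = $fromFunction('abc');        // "ABC"
```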


Of course... I feel compelled to ask why we can't just use bare function names.  Treating a bare word (an undefined constant) as a string has been deprecated for several versions.  If we remove that in PHP 8 and instead let it mean constant, then function, then class name, the following would become legal:

$result = Stats::of($string)
    ->analyze(normalizeNewlines)
    ->analyze(readingLevel)
    ->analyze(countStats)
    ->analyze([$synonymSuggester, analyze])
    ->analyze([WordCounter, analyze])
    ->analyze(fn($s) => wordDistribution($s, DISTRIBUTION_LIMIT))
;

Which would be much more in line with how many other languages handle symbol names.  (There is likely some engine reason why it's way harder than I make it sound; I'd love to hear what that is so I know not to suggest it again, unless it really is that simple in which case...)
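
One thing a "constant, then function, then class" lookup would have to account for is that constants and functions live in separate symbol tables today, so the same name can legally be both.  A minimal sketch, with a deliberately contrived constant that is not part of the real code:

```php
<?php
// Contrived example: a constant and a function sharing one name.
const countStats = 'a constant';

function countStats(string $s): int
{
    return strlen($s);
}

// Today a bare name in expression position resolves as a constant...
$bare = countStats;             // "a constant"

// ...while the same name followed by parentheses calls the function.
$called = countStats('abcd');   // 4
```

Under the ordering suggested above, the constant would win in this case.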


In all, I don't think there's a clear obvious winner here.  All of the options have some kind of trade-off.  My own preferences lean toward:

1) Keep it short; we don't need more long words.
2) No more parentheses; in context, there will be ample parentheses already anywhere a function is being referenced as a callable, so let's not drift more toward LISP.  (This is why I don't think "well wrap it in Closure::fromCallable()" is a good answer.)
3) Handling method callables at the same time is *not* required, but would be really good for a complete solution.

--Larry Garfield